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PREFACE 


As  an  undergraduate  of  Physics,  I  often  wondered  how  the  profes¬ 
sional  engineers  were  actually  using  computers  to  solve  all  of  those 
intimidating  equations.  Therefore,  it  was  very  satisfying  for  me  to 
come  to  AFIT  and  to  have  the  opportunity  to  study  numerical  analysis 
and  computer  systems.  This  thesis  gave  me  a  chance  to  apply  that  know¬ 
ledge  to  an  important  problem  and  to  make  a  contribution  that  might 
improve  the  capabilities  of  the  Space  Shuttle.  This  project  also 
taught  me  some  new  things  about  matrix  theory,  and  it  was  my  first 
chance  to  work  extensively  with  a  main-frame  scientific  computer.  It 
wasn't  easy,  but  I  am  grateful  for  the  experience. 

Thanks  are  due  to  Captain  James  Hodge  for  his  frequent  assistance 
concerning  the  use  of  the  HEATEST  Program  in  particular  and  the  CYBER 
computer  in  general.  Captain  David  Audley  was  most  helpful  concerning 
the  mathematical  operations  of  the  HEATEST  Program.  Of  course,  the 
overall  guidance  and  patience  of  Dr.  Dennis  Quinn  has  been  invaluable 
to  the  successful  completion  of  this  thesis. 

Finally,  I  would  like  to  apologize  to  my  wife,  Inok,  for  my 
frequent  absence  during  her  first  18  months  of  residence  in  the  United 


States. 
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ABSTRACT 
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M 

A  computer  program  named  HEATEST  required  excessive  computer  time 
to  evaluate  the  matrix  Riccati  equation  for  temperature  covariance. 
Alternative  numerical  methods  were  employed  to  compute  the  Riccati 
equation,  and  the  HEATEST  program  execution  time  was  reduced  by  70%. 
However,  cumulative  temperature  covariance  rose  from  2.45  to  3.28  degrees. 
This  rise  was  considered  insignificant. 

A  survey  was  conducted  of  methods  for  computing  the  matrix  exponen¬ 
tial.  A  triangular  matrix  decomposition  method  proved  to  be  more  efficient 
than  summing  the  Taylor  series,  especially  for  matrices  with  a  large 
condition  number.  This  substitution  produced  an  overall  10%  decrease 
in  HEATEST  execution  time  with  comparable  accuracy. 

Simpson's  Rule  was  used  to  evaluate  the  matrix  Riccati  integral 
term.  The  accuracy  of  this  method  was  in  the  range  of  5  to  9  significant 
digits,  and  computation  time  for  the  integral  term  was  reduced  by  90% 
for  a  matrix  of  order  13.  This  substitution  prompted  the  rise  in  the 
covariance.  „ 

FORTRAN  program  modules  and  numerical  examples  are  included. 


I.  INTRODUCTION 


PROBLEM  DESCRIPTION 

This  thesis  attempts  to  solve  one  aspect  of  the  problem  of  deter¬ 
mining  allowable  limits  for  the  reentry  trajectory  envelope  of  the  Space 
Transportation  System  (STS  or  Space  Shuttle).  The  Air  Force  Flight  Test 
Center  developed  a  strategy  to  determine  this  reentry  envelope  quantita¬ 
tively.  The  strategy  depended  on  the  use  of  a  computer  program  named 
HEATEST  to  analyze  temperature  data  recorded  during  Shuttle  reentry  and 
to  compute  the  values  of  certain  aerothermodynamic  parameters  by  means 
of  an  iterative  estimation  technique. 

The  HEATEST  program  was  used  successfully  to  process  thermal  data 
from  wind  tunnels,  the  Shuttle  simulator,  and  a  reentry  maneuver  during 
the  first  Shuttle  test  flight.  The  major  drawback  was  that  the  program 
required  large  amounts  of  expensive  computer  processor  time.  According 
t"  ’lodge,  1982,  the  Air  Force  Flight  Test  Center  expended  a  substantial 
portion  of  a  $50,000  budget  on  computer  time  to  operate  the  HEATEST  pro¬ 
gram  during  the  first  year  after  its  development. 

Computer  experts  at  Edwards  AFB  used  a  software  profiling  tool 
to  determine  that  most  of  the  processing  time  was  absorbed  in  performing 

matrix  multiplications  while  evaluating  this  matrix  Riccati  equation: 
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=  a  tridiagonal  system  matrix 
=  the  transpose  of  the  A  matrix 
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tn-l  =  ”'ncrement  of  heat  propagation 

=  a  dummy  time  variable 

=  a  system  error  matrix 

=  a  priori  value  of  propagated  temperature 
covariance  at  time  t 

n 

=  a  posteriori  value  of  covariance  at  time 


The  required  computation  time  increased  rapidly  as  a  function  of 
the  size  of  the  matrices.  For  practical  time  limits,  the  size  of  the 
system  matrix  A  was  limited  to  order  13  (representing  13  temperature 
nodes),  which  effectively  modelled  only  the  outer  inch  of  the  Shuttle 
thermal  protective  system  (TPS).  Subsequent  Shuttle  missions  will  per¬ 
form  test  maneuvers  at  a  later  time  in  the  reentry  flight  profile  when 
heat  will  have  penetrated  deeper  into  the  TPS,  so  it  would  be  desirable 
to  process  more  nodes  (and  larger  matrices)  to  adequately  model  the 
true  Shuttle  heat  parameters.  This  problem  has  a  particular  urgency  at 
the  time  of  this  thesis,  because  planned  Shuttle  landings  at  Vandenburg 
AFB  will  require  trajectories  with  increased  heat  stress. 

OBJECTIVES 


The  main  objective  of  this  project  was  to  investigate  methods 
for  improving  the  performance  of  the  HEATEST  program.  Specifically, 
the  project  had  the  following  goals: 

A.  To  make  a  baseline  performance  measurement  of  the  numerical 
method  originally  used  to  calculate  the  exponential  matrix  e^. 


The  required  processor  time  was  to  be  measured  as  a  function  of  the 
order  of  matrix  A. 

B.  To  make  a  literature  search  of  other  methods  for  computing 
the  exponential  matrix. 

C.  To  implement  alternative  numerical  methods  on  the  AFIT 
CYBER  computer  in  the  Fortran  5  language,  and  to  assess  their  relative 
performance. 

D.  Hopefully,  to  discover  an  improved  computation  method,  and 
to  incorporate  that  technique  into  the  HEATEST  program. 

E.  To  measure  the  extent  of  any  improvement  in  overall  HEATEST 
program  execution  time  which  is  produced  by  improved  computation  of  the 
matrix  exponential. 

F.  To  use  standard  software  engineering  practices  when  modifying 
the  HEATEST  program.  These  are  to  include  modular  design,  top-down 
organization,  and  comprehensive  documentation  including  source  references. 

The  original  goals  were  refined  and  extended  as  the  research 
progressed.  The  literature  search  resulted  in  one  very  promising  approach 
to  the  computation  of  the  matrix  exponential.  The  investigation  of  less 
favorable  alternatives  was  sacrificed  in  order  to  devote  full  effort 
toward  implementing  a  triangular  matrix  decomposition  technique.  This 
technique  then  prompted  an  idea  for  efficiently  approximating  the  integral 
term  of  the  Riccati  equation  by  means  of  Simpson's  rule.  These  numerical 
methods  were  ultimately  incorporated  into  the  HEATEST  program  with  great 
success  at  reducing  overall  program  execution  time. 


INITIAL  RESEARCH  STRATEGY 


One  possible  solution  might  be  to  obtain  an  array  processor,  a 
computer  logic  unit  whose  architecture  is  optimized  toward  the  efficient 
handling  of  matrix  computations.  This  would  not  be  very  convenient  and, 
more  importantly,  the  rapid  increase  of  processing  time  in  response  to 
matrix  size  suggests  that  more  efficient  numerical  methods  would  be  a 
better  solution. 

Equation  (1)  is  seen  to  consist  of  two  terms,  each  of  which  is 
computed  in  the  HEATEST  program  by  summing  a  separate  Taylor  series 
expansion.  Furthermore,  these  series  must  be  recomputed  over  many 
subintervals  of  time  in  order  to  guarantee  mathematical  stability  and 
accuracy.  The  time  consuming  matrix  multiplication  was  necessary  for 
both  Taylor  series  program  modules,  which  are  named  MEXP  and  INTEG, 
respectively.  This  author  chose  to  investigate  alternative  methods  for 

a  f 

computing  the  matrix  exponential  e  in  an  attempt  to  produce  a  better 
MEXP  routine. 

The  following  methods  were  intially  proposed,  to  be  supplemented 
later  by  methods  from  a  literature  search: 

A.  Summing  a  smaller  of  terms  in  the  Taylor  series. 

B.  Taking  advantage  of  the  assumed  symmetric  nature  of  the  A 
matrix  in  order  to  convert  it  to  a  similar  diagonal  form. 

C.  Developing  a  specialized  matrix  multiplication  algorithm  which 
is  optimized  toward  tridiagonal  matrices  in  order  to  reduce  the  required 
number  of  arithmetic  operations. 

D.  Converting  the  A  matrix  to  a  similar  triangular  form,  and 
developing  a  multiplication  algorithm  which  is  optimized  toward  that 


These  possibilities  will  be  examined  in  the  next  two  chapters  of 
this  thesis.  The  information  in  this  chapter  was  drawn  from  Hodge, 
Phillips,  and  Audley,  1981  and  from  Hodge,  1982. 


II.  SURVEY  OF  COMPUTATION  METHODS  FOR  THE  MATRIX  EXPONENTIAL 

INTRODUCTION 

At 

Computation  of  the  matrix  exponential  e  is  an  important  general 
mathematical  problem  with  wide  application  to  several  types  of  physical 
processes.  Many  mathematical  models  involve  systems  of  linear,  constant 
coefficient  ordinary  differential  equations  of  the  form 

x1 (t)  =  Ax(t) 

where  A  is  a  coefficient  matrix 
x(t)  is  a  solution  vector 
x(0)  =  x Q  is  the  initial  condition  vector 
x'  represents  the  first  derivative  of  x 
The  solution  to  this  equation  is  given  by 

x ( t )  =  eAtxQ. 

This  review  of  computation  techniques  effectively  begins  and  ends 
with  the  comprehensive  survey  by  Moler  and  Van  Loan,  1978.  That  survey 
evaluated  19  different  practical  methods  for  computing  the  matrix  exponen 
tial.  The  methods  were  grouped  into  five  general  classes  and  their  rela¬ 
tive  advantages  were  assessed.  The  methods  in  the  survey  are  oriented 
toward  the  solution  of  matrices  A  whose  order  n  is  less  than  a  few 
hundred,  so  the  constituent  elements  can  be  readily  stored  in  the  primary 
memory  of  contemporary  computers. 

Other  literature  sources  were  examined  from  the  references  quoted 
by  Moler  and  Van  Loan,  1978,  from  cross  references,  and  from  articles 
listed  in  the  Government  Records  Annual  Index  (GRAI).  However,  many  of 
these  other  papers  specifically  referred  to  the  larger  survey  (see,  for 


instance:  Van  Loan,  1977:981  and  Ward,  1977:610)  or  else  were  directly 
referenced  by  Moler  and  Van  Loan,  1978.  No  other  article  was  found 
which  contained  either  a  comparison  of  methods  or  which  had  a  comparable 
discussion  of  practical  computer  implementation  considerations. 

The  remainder  of  this  section  will  essentially  consist  of  selected 
extracts  from  Moler  and  Van  Loan,  1978.  Unless  otherwise  indicated,  every 
passage  in  quotation  marks  is  from  this  source.  The  most  interesting 
methods  will  be  presented  and  critiqued  as  to  their  usefulness  in 
solving  the  problem  at  hand. 


SERIES  METHODS 

An  algorithm  can  be  immediately  developed  from  the  Taylor  series 
expansion 

eA  =  I  +  A  +  P?'->\  +  ...  (2) 


One  straightforward  approach  would  be  to  sum  the  series  by  adding  terms 

until  the  limit  of  machine  precision  is  exceeded.  Unfortunately,  this 

naive  approach  has  a  serious  flaw  which  makes  it  useful  only  for  setting 

a  lower  bound  on  the  efficiency  of  calculations. 

The  matrix  exponential  has  a  property  which  every  algorithm  must 

At 

overcome.  As  t  increases  the  elements  of  e  may  grow  before  they  decay. 
This  is  true  to  some  extent  for  any  nonsymmetric  matrix.  This  results 
in  a  hump  which  is  displayed  in  Figure  2.  An  example  will  show  why  this 
is  a  problem.  "Take  the  input 


A  =  P-49  24  1  =  T 1  3  1  (~-l  0  "I  I~1  3*  -1 

_-64  31  L2  4  J  .  0  "17_  _2  4  . 


(3) 


-5  -6 

Using  a  machine  precision  of  16  =  .95  x  10  ,  the  first  59  terms  m 


7 


the  Taylor  series  expansion  are  summed  to  obtain  the  output 

eA  =  -22.25880  -1.432766 

_-61. 49931  -3.474280 

Analytic  calculation  of 


gives  an  output  to  six  decimal  places  of 


(4) 


(5) 


(6) 


which  is  not  even  close  to  the  first  solution.  The  problem  is  with  the 
intermediate  results  A^/16!  and  A^/17!  which  have  elements  greater  than 
106  in  magnitude  but  of  opposite  sign.  Therefore,  the  intermediate  elements 
have  absolute  errors  which  are  larger  than  the  final  result.  It  is  impor¬ 
tant  to  realize  that  the  problem  is  with  arithmetic  truncation  and  not 
with  the  series  truncation." 

The  matrix  exponential  can  also  be  calculated  with  a  series  of  PADE 
rational  approximants  or  continued  partial  fractions.  Moler  and  Van  Loan, 
1978:808  state  that  "roundoff  error  makes  PADE  approximants  unreliable." 
Furthermore,  difficulties  with  matrix  inversion  make  this  method  inaccurate 
"when  the  A  matrix  has  widely  spread  eigenvalues."  This  is  exactly  the 
type  of  matrix  that  is  generated  by  the  HEATEST  program. 

The  most  sophisticated  method  for  using  a  series  approximation 
is  called  "scaling  and  squaring".  This  is  the  same  time-consuming  method 


that  was  originally  employed  in  HEATEST  in  the  module  MEXP.  "Roundoff 
error  and  computing  costs  tend  to  increase  as  1 1 |a 1 1 ,  or  the  spread  of 


the  eigenvalues,  increases."  This  is  controlled  in  scaling  and  squaring 
by  "exploiting  the  unique  exponential  property  eA  =  (e^m)m." 

"The  idea  is  to  choose  m  to  be  a  power  of  two  for  which  eA^m  can 
be  reliably  computed,  and  then  to  form  (e^m)m  by  repeated  squaring.  One 
common  criterion  is  to  choose  m  such  that  ||a||  /msl."  The  HEATEST  module 
MEXP  tries  to  save  a  step  by  choosing  m  such  that  ||A  1 1 /ms3 .  This  technique 
tends  to  keep  down  the  hump.  "The  scaling  and  squaring  algorithm  is  one 
of  the  most  accurate  that  we  know." 

POLYNOMIAL  METHODS 

The  matrix  exponential  has  many  interesting  polynomial  decomposi¬ 
tions.  Unfortunately,  they  contain  serious  problems  for  computational 
purposes. 

CAYLEY-HAMILTON  THEOREM:  Any  function  of  A  can  be  expressed  in 
terms  of  a  polynomial  in  I,A,...,  An_1.  "This  implies  that  eAt  can  be 
expressed  as  a  polynomial  in  A  with  analytic  coefficients  in  t: 
n-1 

eAt  =  a-j (t)A^  "  (7) 

j=0  J 

Moler  and  Van  Loan,  1978:816  describe  a  means  of  generating  the  func¬ 
tions  a.(t).  However,  this  method  requires  a  priori  knowledge  of  the 

J 

characteristic  polynomial  of  A,  is  affected  by  roundoff  error,  and  is 
structurally  similar  to  the  naive  form  of  the  Taylor  series. 

LAGRANGE  INTERPOLATION,  NEWTON  INTERPOLATION,  and  the  VANDERMONDE 
At 

MATRIX:  e  can  be  expressed  in  terms  of  each  of  these  well-known 

formulas.  However,  these  mthods  require  a  priori  knowledge  of  the  eigen- 

4 

values  of  A,  they  are  "algorithms  of  the  order  of  n  operations  (pro¬ 
hibitive  except  for  small  n),"  and  they  suffer  problems  when  the 


3 


eigenvalues  are  nearly  equal  (confluent). 

INVERSE  LAPLACE  TRANSFORMS:  These  "inverse  transforms  can  be 
expressed  as  a  power  series  in  t."  Other  techniques  can  also  be  used  such 
as  Heaviside  expansion.  "These  methods  are  also  O(n^)  and  are  affected 
by  roundoff  error." 

ORDINARY  DIFFERENTIAL  EQUATION  METHODS 
At 

"Since  e  is  a  solution  to  ordinary  differential  equations,  it 
is  possible  to  consider  methods  based  on  numerical  integration."  Many 
powerful  computer  programs  have  been  developed  for  solving  O.D.E.'s. 
"Methods  based  on  single-step  formulas,  multi-step  formulas  and  variable 
step  size"  all  have  two  features  in  common:  they  are  relatively  "easy 
to  use",  and  they  are  "relatively  time-consuming."  "The  O.D.E.  programs 
are  designed  to  solve  a  single  initial  value  system 

x'  =  f(x,t)  with  x(0)  =  Xq 

and  to  obtain  the  solution  at  many  values  of  t." 

General  O.D.E.  solvers  all  suffer  from  "not  taking  advantage  of 
the  linear,  constant  coefficient  nature"  of  the  matrix  exponential 
problem.  Instead,  they  traverse  "a  sequence  of  values  0  =  tg.tj,..., 
tn  =  t."  Moler  and  Van  Loan,  1978:813  test  three  published  programs 
that  are  general  purpose  O.D.E.  solvers.  The  results  show  mainly 
that  these  programs  are  inconsistent  and  "very  inefficient"  for  computing 
the  matrix  exponential  in  their  present  form.  "They  repeatedly  multiply 
various  vectors  by  the  matrix  A  because,  so  far  as  they  know,  it  may 
have  changed  since  the  last  multiplication." 
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MATRIX  DECOMPOSITION  METHODS 


Methods  "based  on  factorizations  or  decompositions  of  the  matrix 

A"  are  "likely  to  be  the  most  efficient  for  problems  involving  the  larger 

At 

matrices  and  for  repeated  evaluations  of  e  If  A  is  symmetric,  then 
these  methods  are  particularly  simple  and  effective. 

"All  of  the  matrix  decompositions  are  based  on  similarity  trans- 

-1  At 

formations  of  the  form  A  =  SBS  .  The  power  series  definition  of  e 

implies  eAt  =  SEBtS"^.  The  idea  is  to  find  an  S  which  is  easy  to  com¬ 
pute.  One  difficulty  is  that  S  may  be  close  to  being  singular  (not 
invertible),  which  means  that  eAt  may  be  difficult  to  compute  accurately. 

DIAGONALIZATION:  "The  naive  approach  is  to  take  S  to  be  the 
matrix  whose  columns  are  eigenvectors  of  A."  Then  we  can  write  AV  =  VD 
where  V  is  the  matrix  of  column  vectors 

D  is  the  matrix  of  diagonal  eigenvalues 
Then  the  exponential  eDt  of  D  is  trivially  computed  by  replacing  each 
eigenvalue  x  in  D  by  e  . 

There  are  several  problems  that  can  arise  when  using  this  method. 
The  first,  "a  theoretical  difficulty,  occurs  when  A  does  not  have  a  com¬ 
plete  set  of  linearly  independent  eigenvectors  and  is  therefore  defective 
In  this  case,  the  method  is  unworkable  because  there  is  no  invertible 
matrix  V  of  eigenvectors." 

The  second  problem,  a  practical  difficulty,  occurs  when  two 
eigenvalues  are  nearly  equal  (confluent),  but  not  exactly  so.  "This  can 
be  illustrated  by  the  example  A  =  x  a 

0  u 
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At 

The  exponential  e  = 

where  (x  -  u)  is  small  but  not  negligible,  computation  of  the  divided 
difference 


X  -  u 


can  result  in  a  large  relative  error,"  especially  when  multiplied  by 
large  a. 

A  third  difficulty  arises  from  calculation  of  V”*.  This  is  another 
source  of  inaccuracy  when  using  this  naive  method  to  compute  the  matrix 
exponential . 

JORDAN  CANONICAL  FORM  (JCF):  "In  principle,  the  problem  caused 
by  defective  systems  of  eigenvectors  can  be  solved  by  using  a  similarity 
transformation  to  the  Jordan  canonical  form."  The  JCF  is  ?,  special  form 
of  matrix  which  consists  of  all  zero's  except  for  square  blocks  of 
non-zero  numbers  along  the  main  diagonal.  These  Jordan  blocks  reflect 
the  vector  subspaces  that  define  the  similarity  class  to  which  the  matrix 
belongs.  "The  exponential  of  each  Jordan  block  can  be  calculated  in 
closed  form,  and  eAt  =  PeJtP_1,  where  J  *  J^...,  Jn" ,  a  concatenation 
of  Jordan  block  submatrices. 

"For  example. 


(9) 


0  X.  1  0 

0  0  1 

0  0  0  x. 


.At 


axt  ut 
e  -  e 

x  -  u 
_ut 


2 


then  e Jit  =  eAit 


The  difficulty  with  this  method  is  that  it  cannot  be  computed  using 
floating  point  arithmetic.  A  single  rounding  error  at  the  limit  of 
machine  precision  may  cause  a  multiple  eigenvalue  to  become  distinct 
or  vice  versa,  which  would  alter  the  structure  of  J  and  P."  For  example, 
two  matrices 


will  have  different  Jordan  forms  for  any  non-zero  error  e,  no  matter 
how  small . 

SCHUR  DECOMPOSITION:  A  more  reliable  method  is  to  use  the  so- 
called  Schur  transformation  A  =  UTU"1,  where  U  is  a  unitary  (orthogonal) 
matrix  and  "T  is  a  triangular  matrix  that  will  always  exist  if  A  has 
real  eigenvalues.  If  A  has  complex  eigenvalues,  then  T  will  be  quasi- 
triangular  with  2-by-2  blocks  on  the  main  diagonal."  The  e  *  =  Ue^U”*. 
This  method  is  especially  convenient  because  of  the  well-known  property 
that  the  orthogonality  of  U  implies  that  the  inverse  matrix  lT*  is  equal 
to  the  transpose  matrix  U^. 

The  only  difficulty  occurs  when  the  matrix  T  has  eigenvalues  that 
are  nearly  confluent,  which  can  induce  a  magnification  of  roundoff  error 
when  computing  e^.  "This  is  a  general  problem  for  all  matrix  decomposi¬ 
tions  of  the  form  A  =  SBS"*  and  involves  two  objectives: 


1 

0 

0 

0 


t 

1 

0 
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1 
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A.  Make  B  close  to  diagonal  so  that  e  is  especially  easy  to 
compute."  The  Jordan  form  epitomizes  this  objective. 

"B.  Make  S  well  conditioned  so  that  computational  errors  will 
not  be  magnified."  This  goal  is  realized  by  the  Schur  method. 

A  practical  compromise  can  be  achieved  between  these  objectives. 
The  idea  is  to  "cluster  the  nearly  confluent  eigenvalues"  into  block 
diagonal  form,  using  a  somewhat  non-orthogonal  matrix  S  if  necessary. 

C  .  t 

Then  special  techniques  can  be  used  to  compute  e  i  for  each  block. 

The  simplest  technique  would  be  to  replace  each  eigenvalue  in  the  block 
by  the  average  of  all  those  in  the  block.  The  best  criterion  for  cluster 
ing  eigenvalues  remains  to  be  determined. 

SPLITTING  METHODS 

Moler  and  Van  Loan  discuss  a  special  characteristic  of  the  matrix 

exponential  and  present  the  formula  eA  =  (eB'/meC'/n1)m.  "This  approach  to 

A  B  C 

computing  e  has  potential  interest  when  the  exponentials  e  and  e 

can  be  accurately  and  easily  computed."  Then  it  might  be  convenient 

to  split  A  =  B  +  C. 

"The  efficiency  of  this  technique  is  somewhat  difficult  to  assess 
because  it  depends  strongly  on  the  scalar"  elements  of  A.  Under  general 
conditions,  the  splitting  method  is  considered  to  be  much  less  efficient 
than  simply  scaling  and  squaring  the  Taylor  series. 

OVERALL, EVALUATION  OF  THE  METHODS 

A  large  segment  of  the  available  project  time  was  dedicated  to 
studying  the  survey  by  Moler  and  Van  Loan,  1978,  and  a  selection  of  the 
numerous  papers  that  are  referenced  by  them.  For  purposes  of  this  thesis 
the  most  attractive  computational  methods  were  extracted  from  the  survey. 


organized  as  presented  above,  and  evaluated  for  application  to  the 
problem  at  hand. 

Polynomial  techniques  were  quickly  discarded  because  they 
require  prior  knowledge  of  the  eigenvalues,  and  because  they  generally 
require  O(n^)  floating  point  operations  for  computer  computations. 

The  splitting  method  is  generally  untried  and  does  not  offer 
any  special  advantage  for  the  purposes  of  the  HEATEST  program. 

Ordinary  differential  equation  solvers  seem  to  have  potential 
for  general  application  to  computing  functions  such  as  the  matrix 
exponential,  and  the  development  of  an  efficient  software  program  could 
possibly  form  the  basis  for  a  dissertation.  That  type  of  investigation 
seems  to  be  beyond  the  scope  of  this  thesis,  however.  Only  series  and 
matrix  decompositions  remain  for  serious  consideration. 

A  statement  is  made  (Moler  and  Van  Loan,  1978 : p .  827)  that  the 
only  generally  competitive  series  method  is  that  of  scaling  and  squaring. 
This  technique  was  the  one  used  in  the  original  HEATEST  implementation, 
calculating  up  to  38  terms  of  the  series  for  each  subinterval  of  time 
for  a  matrix  of  order  n  =  13.  The  extensive  matrix  multiplications 
required  for  computing  this  series  is  the  driving  factor  behind  the 
excessive  use  of  computer  processor  time,  even  though  module  MEXP  uses 
the  Cayley-Hamilton  theorem  to  increase  efficiency. 

Matrix  decomposition  seemed  to  offer  the  best  potential  for  investi¬ 
gation.  The  orthogonal  property  of  the  U  transformation  matrix  allows  the 
inverse  U"*  to  be  easily  obtained  because  it  equals  the  transpose  UT, 
and  it  also  preserves  the  condition  number  (see  Chapter  V  for  definition) 
of  the  matrix  without  magnifying  the  roundoff  errors  that  are  introduced 
by  exponentiation. 
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A  decision  was  made  to  investigate  the  Schur  technique  for  matrix 
decomposition.  At  the  same  time,  the  form  of  the  A  matrix  could  be 
examined  for  possible  ways  of  developing  a  specialized  multiplication 
algorithm  which  could  be  used  to  more  efficiently  calculate  the  original 
series  summation.  This  investigation  and  program  development  will  be 
detailed  in  the  next  section. 
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COMPUTING  THE  MATRIX  EXPONENTIAL  (DEVELOPMENT  OF  MEXP2) 


THEORY 

At 

The  HEATEST  program  uses  a  module  named  MEXP  to  compute  e  by 
means  of  the  scaling  and  squaring  method.  MEXP  and  its  support  modules 
are  listed  in  Appendix  A,  along  with  a  structure  chart  which  diagrams 
their  interaction.  The  first  efforts  to  develop  an  improved  module,  MEXP2, 
were  based  on  the  idea  of  a  more  efficient  matrix  multiplication  algorithm 
which  would  decrease  the  computation  time  by  a  linear  factor.  This 
required  an  examination  of  the  characteristics  that  define  the  typical 
A  matrix  to  be  processed. 

A  random  system  matrix  of  order  10  was  obtained  from  HEATEST 
and  used  as  a  test  case  for  investigation.  This  matrix  is  displayed  in 
Figure  3  and  is  seen  to  have  the  following  properties: 

A.  Tridiagonal  form. 

B.  Not  symmetric,  but  nearly  so. 

C„  Diagonally  dominant.  This  means  that  each  diagonal  value 
is  greater  than  the  sum  of  all  other  numbers  in  that  row  or  column. 

D.  All  entries  in  the  main  diagonal  have  negative  sign,  while 
all  entries  in  the  off-diagonal  rows  have  positive  sign. 

E.  A  wide  range  of  values. 

These  properties  will  be  seen  to  have  great  significance. 

The  first  consideration  was  to  attempt  to  exploit  the  inherent 
sparseness  of  the  tridiagonal  matrix.  Several  published  computer  codes 
are  already  available  for  manipulating  tridiagonal  matrices  and  for 
storing  the  non-zero  values  in  a  reduced  form  which  saves  on  primary 
computer  memory.  One  source  is  the  International  Mathematics  and 


7 


Statistics  Library  (IMSL).  This  idea  proved  not  to  be  useful  for  at 
least  two  reasons.  First  of  all,  the  reduced  form  of  storage  would 
not  help  unless  implemented  throughout  the  entire  HEATEST  program,  which 
was  not  feasible  in  the  time  available.  More  importantly,  the  tri¬ 
diagonal  form  is  not  closed  under  the  operation  of  matrix  multiplication, 
as  shown  in  Figure  4.  It  can  be  seen  that  the  resulting  matrix  bandwidth 
increases  by  two  off-diagonal  rows  after  each  operation.  This  property 
is  not  conducive  to  a  specialized  algorithm  for  calculating  series 
summations . 

A  more  promising  candidate  would  be  the  triangular  form  shown 
in  Figure  5,  which  is  closed  under  matrix  multiplication.  The  sparse¬ 
ness  of  the  triangular  form  can  easily  be  exploited,  as  seen  in  Figure 
7.  The  full  formula  for  upper  triangular  multiplication  is  shown  in 
Figure  6,  and  the  resulting  non-zero  terms  are  shown  in  Figure  7.  It 
would  not  be  difficult  to  devise  a  computer  code  tnat  would  only  include 
the  resulting  non-zero  terms  in  its  computations.  The  efficiency  of 
this  technique  could  be  significant.  By  looking  at  diagonals,  one 
can  see  by  inspection  that  the  number  of  resulting  non-zero  element 
multiplications  can  be  described  as 

n 

1 (n)  +  2(n  -  1)  +  ...  +  n(l)  =  ^  K(n  +  1  -  K) 

K=1  n  n 

■  <"  +  >>X>-X>2  in) 

K=1  K=1 

=  (n  +  l)n(n  +  1)  -  n(2n  +  l)(n  +  1)  =  n(n  +  l)[3(n  +  1)  -  (2n  +  1) 

6  6  6 

=  +  2).  This  sum  has  the  value  ,22{n)  for  n  =  10,  and  it 

6  3 

approaches  the  value  .16(n  )  as  n  increases.  Compare  this  to  the  usual 

3 

matrix  multiplication  which  requires  n  operations. 
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The  algorithm  above  appears  to  be  a  promising  technique  for 
efficient  computation  of  triangular  series  summations.  A  triangular 
scaling  and  squaring  series  method  could  be  used  to  evaluate  e^  for 
a  matrix  T  which  is  obtained  from  the  Schur  decomposition  of  A  =  UTU”1. 
This  interesting  innovation  was  never  tested,  however,  due  to  a  lack 
of  project  time  and  the  successful  implementation  of  an  even  more 
efficient  method. 

Moler  and  Van  Loan,  1978:823,  recommend  using  a  different  technique 
which  can  calculate  any  analytic  function  of  a  triangular  matrix.  This 
method  is  described  in  Parlett,  1976,  and  Fortran  language  program 
modules  are  available  from  Parlett,  1974.  This  method  is  based  on  a 
recurrence  relation  that  exists  among  the  elements  of  the  triangular 
function  matrix  F,  where  F  is  obtained  by  using  the  algorithm  of 
Equations  (12)  and  (13). 

A  lemma  is  presented  (Parlett,  1976:118)  that  if  T  is  a  block 

upper  triangular  matrix,  then  a  function  matrix  F  =  f(T)  is  also  of 

block  upper  triangular  form  with  the  same  block  structure.  The  method 

starts  by  first  computing  the  diagonal  blocks  (or  elements)  of  F  from 

Fpr  =.exp(Trr).  Then  the  upper  triangular  blocks  (elements)  can  be 

determined  from  the  formula 

s-r-1 

TrrFrs  "  Frs^ss  ~  ^  1  ^r,r+k  ^r+k,s  ’  ^r,s-k  %-k.s)  ^or  r<s 
k=0 

for  successive  diagonals,  moving  in  turn  away  from  the  main  diagonal. 

Each  block  F^s  is  determined  as  a  linear  combination  of  lower  diagonal 
blocks  which  are  in  row  r  and  column  s  of  the  function  matrix  F.  Then 
the  right  side  of  the  equation  is  known,  and  the  elements  of  block  Fr$ 


are  obtained  by  solving  a  set  of  simultaneous  equations 

TrrRrs  "  FrsTss  ~  R 

Three  cases  can  occur: 

A.  The  Schur  matrix  is  strictly  triangular  and  the  eigenvalues 
(diagonal  elements)  are  all  distinct.  This  is  the  simplest  case,  and 
the  calculations  are  straightforward.  A  Fortran  program  module  is- 
supplied  by  Parlett,  1974:15. 

B.  The  Schur  matrix  T  has  quasitriangular  form  with  at  least 
one  2-by-2  block  on  the  main  diagonal,  and  the  eigenvalues  are  distinct. 
This  case  is  illustrated  in  Figure  8.  The  computation  of  F  in  this  case 
proves  to  be  a  more  complicated  programming  exercise.  The  block  structure 
must  be  discerned,  block  multiplication  must  be  performed  with  variable 
size  blocks,  and  the  resulting  linear  system  equation  must  be  solved 

with  a  possible  need  for  pivoting. 

C.  The  Schur  matrix  T  has  confluent  eigenvalues,  in  particular 
when  T^r  =  Ts$.  In  this  situation,  the  basic  recurrence  formula  breaks 
down  and  an  alternative  must  be  used.  In  addition,  calculations  in  finite 
precision  arithmetic  can  return  inaccurate  results  when  Trr  is  very  close, 
but  not  exactly  equal,  to  Tss.  This  case  can  be  treated  as  follows: 

1.  Eigenvalues  that  are  very  close  can  be  replaced  by  a  confluent 
eigenvalue  which  is  equal  to  the  average  of  the  similar  eigenvalues.  The 
menaing  of  "very  close"  is  not  specifically  defined. 

2.  Confluent  values  are  treated  with  an  alternate  formula 
(Parlett,  1974:3)  which  is  derived  from  Newton's  form  of  the  interpolating 
polynomial  and  which  is  manipulated  to  obtain  Frg. 

3.  Then  the  general  formula  (13)  can  be  applied.  This  case 


20 


4 

requires  on  the  order  of  n  floating  point  arithmetic  operations.  A 
Fortran  program  module  is  supplied  for  this  case,  also  (Parlett,  1974:16). 

IMPLEMENTATION 

The  development  process  was  started  by  generating  a  baseline 
example.  Under  usual  software  development  conditions,  the  most  efficient 
strategy  is  to  use  the  main  program  as  the  driver  for  new  submodules. 
However,  HEATEST  was  too  large,  in  both  memory  and  required  execution 
time,  to  be  operated  from  an  interactive  CYBER  computer  terminal  with 
the  desired  immediate  turnaround  time.  Therefore,  the  sample  matrix 
in  Figure  3  was  obtained  from  HEATEST  and  used  as  a  test  case.  A  separate 
test  program  was  used  to  compute  the  exponential  of  the  test  matrix 
by  means  of  the  original  MEXP  module  and  its  supporting  submodules. 

The  result  (Figure  9)  was  used  as  a  baseline  reference  for  future  com¬ 
parison  to  ensure  accurate  function  computation.  However,  at  this  point, 
there  was  still  some  uncertainty  as  to  the  degree  of  accuracy  that  was 
returned  by  MEXP. 

The  next  step  was  to  generate  the  matrix  exponential  by  using 
Parlett's  algorithm.  This  required  the  following  procedure: 

A.  Generate  a  Schur  orthogonal  transformation  matrix  U  such  that 
A  =  UTU~1  where  T  is  of  upper  triangular  or  quasitriangular  form. 

B.  Determine  the  eigenvalue  structure  of  the  matrix  T. 

C.  Apply  the  correct  treatment  of  Parlett's  algorithm  to  obtain 
F  =  exp(T) . 

A  -1 

D.  Calculate  e  =  UFU  and  print  the  result  for  comparison. 

These  steps  will  be  discussed  in  turn. 
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It  is  recommended  (Moler  and  Van  Loan,  1978:823)  to  use  Fortran 
modules  ORTHES  and  HQR  from  EISPACK,  1976  to  calculate  the  transforma¬ 
tion  matrix  Q.  ORTHES  is  used  to  transform  any  general  matrix  into 
upper  Hessenburg  form,  in  which  the  first  subdiagonal  is  the  only  non¬ 
zero  lower  diagonal  row.  This  was  not  required  for  purposes  of  HEATEST, 
since  the  tridiagonal  form  of  the  A  matrix  is  effectively  a  subcategory  of 
the  upper  Hessenburg  form.  The  HQR  module  acts  on  the  Hessenburg  form 
to  obtain  an  orthogonal  U  matrix  by  the  iterative  QR  algorithm.  The 
EISPACK  HQR  module  was  modified  slightly  by  the  author  of  this  thesis 
to  eliminate  additional  accumulations  of  the  eigenvalues  of  A.  This 
reduced  the  required  number  of  input/output  parameter  arrays  for  HQR, 
which  in  turn  reduced  the  amount  of  primary  computer  memory  needed  to 
operate  the  module.  For  the  test  matrix  in  Figure  3,  HQR  generated  the 
transformation  matrix  U  in  Figure  10.  It  is  well-known  that  for  a  given 
orthogonal  matrix  U,  the  inverse  If1  is  just  the  transpose  UT.  The 
matrix  in  Figure  10  was  multiplied  by  its  transpose  to  verify  orthogonality 
and,  in  fact,  the  identity  matrix  was  the  result. 

The  next  step  in  the  exponentiation  procedure  is  dependent  on  the 
eigenvalue  structure  of  the  matrix  T.  The  test  matrix  in  Figure  3  was 
transformed  to  the  triangular  matrix  in  Figure  11.  The  main  diagonal 
elements  are  the  eigenvalues,  which  are  seen  to  be  all  non-zero,  real, 
and  distinct.  The  wide  spectrum  of  the  eigenvalues  indicates  a  poor 
condition  of  the  matrix  for  computational  purposes. 

This  is  all  very  fine  for  the  test  case,  but  a  vital  question 
remains  as  to  whether  every  A  matrix  generated  by  HEATEST  will  exhibit 
a  similar  eigenvalue  structure.  One  basis  for  conclusion  would  be  to 
evaluate  all  the  system  matrices  that  are  generated  during  one  reentry 
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flight.  If  these  matrices  all  had  the  same  structure,  then  an  argument 
could  be  made  to  project  that  result  to  the  general  case.  However,  Hodge, 
1982,  recommended  an  analysis  of  the  generating  equations  in  the  hope  of 
mathematically  proving  a  general  result.  The  following  paragraph  presents 
an  analysis  which  establishes  this  result,  except  for  the  confluence  of 
eigenvalues. 

HEATEST  generates  the  A  matrix  according  to  the  system  of  equations 
in  Figure  12  (Hodge,  Phillips,  and  Audley,  1981:4)  which  describe  the 
energy  balance  of  the  heat  propagation  in  the  Shuttle  Thermal  Protective 
System  (TPS).  The  first  equation  is  of  fourth  order  and  describes  the 
interactive  heat  transfer  at  the  surface  of  the  TPS.  This  equation  is 
responsible  for  the  non-symmetric  character  of  the  system  matrices.  A 
cursory  analysis  of  the  energy  balance  equations  will  reveal  that  every 
A  matrix  generated  by  HEATEST  will  be  both  tridiagonal  and  diagonally 
dominant.  Furthermore,  the  elements  of  the  main  diagonal  will  all  have 
the  same  sign,  and  the  elements  of  the  two  off-diagonal  rows  will  all 
have  the  same  opposite  sign.  Diagonal  dominance  ensures  that  none  of 
the  eigenvalues  will  be  equal  to  zero.  Another  theorem  was  found 
(Marcus  and  Mine,  1964:166)  which  proves  that  all  Jacobi  (tridiagonal) 
matrices  with  the  above  characteristics  have  a  spectrum  of  eigenvalues 
which  are  both  real  and  simple  (no  multiple  values). 

The  above  analysis  is  very  useful,  but  a  nebulous  question 
remained  as  to  whether  any  of  the  distinct  eigenvalues  would  be  so 
nearly  confluent  as  to  induce  computational  errors  in  the  calculations. 
Hodge,  1982,  felt  that  the  eigenvalues  would  be  closest  to  each  other 
_  at  the  beginning  of  Shuttle  reentry  when  heat  transfer  is  slight  and 

the  surface  temperatures  are  closest  to  the  reference  values  during 
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on-orbit  conditions.  This  case  was  simulated  by  applying  Parlett’s 
algorithm  to  a  similar,  but  worst  case,  test  matrix  which  was  obtained 
from  a  thin  skin  metal  plate  that  generated  temperature  data  in  a  wind 
tunnel.  This  metal  plate  had  a  uniform  temperature  through  its  cross- 
section,  and  the  system  matrix  was  both  singular  (non-invertible)  and 
not  diagonally  dominant.  This  situation  could  not  occur  within  the  thermal 
shield  itself,  but  it  is  similar  to  the  thin  skin  structure  of  the  Shuttle 
which  is  directly  underneath  the  TPS. 

The  Schur  transformation  of  this  worst  case  matrix  produced  a 
triangular  T  matrix  which  had  one  eigenvalue  equal  to  zero,  but  which 
also  exhibited  a  spread  of  the  eigenvalues  across  the  spectrum.  Based 
on  this  demonstrated  lack  of  confluence,  a  decision  was  made  to  proceed 
by  using  the  simplest  case  (case  A)  of  Parlett's  algorithm,  which  can 
be  applied  even  to  singular  matrices  as  long  as  the  eigenvalues  are 
distinct. 

The  module  FUNCT  was  developed  by  this  author  from  subroutine 
FUNUPPD  (Parlett,  1974:16).  A  problem  was  experienced  with  underflow 
when  initially  computing  the  exponential  of  the  diagonal  elements  of  the 
T  matrix.  This  problem  was  solved  by  means  of  an  IF  statement  that 
returns  an  exponential  value  of  0.0  whenever  the  input  diagonal  entry 
has  a  negative  magnitude  larger  than  (-43).  This  guarantees  14  decimal 
places  of  accuracy,  which  is  over  the  limit  of  single  precision  on 
the  CYBER  computer. 

A  module  MEXP2  was  then  developed  by  this  author  which  used 
module  HQR  to  generate  the  Schur  transformation  matrix,  and  then  it 
used  FUNCT  to  calculate  F  =  exp(T)  by  Parlett's  algorithm.  When  used 
on  the  test  matrix  in  Figure  3,  the  matrix  result  of  MEXP2  agreed 


exactly  with  Figure  9,  from  MEXP,  to  the  five  decimal  places  that  were 
displayed. 

RESULTS 

Module  MEXP2  was  incorporated  into  the  HEATEST  program  with  some 
software  difficulties  that  will  be  discussed  in  a  later  section.  HEATEST 
was  executed  over  a  range  of  matrix  dimensions,  using  both  MEXP  and  MEXP2 
for  comparison  as  to  both  accuracy  and  execution  time.  Each  iteration 
of  HEATEST  produced  110  executions  of  the  exponential  matrix  function. 
These  execution  times  were  averaged  and  the  results  are  presented  in 
Table  1  and  Figure  13.  While  returning  comparable  accuracy,  MEXP2  showed 
that  the  Schur  decomposition  method  was  dramatically  faster  than  the 
series  method  of  module  MEXP. 


IV.  COMPUTATION  OF  THE  RICCATI  INTEGRAL  (DEVELOPMENT  OF  INTEG2) 
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THEORY 

The  performance  of  module  MEXP2  demonstrated  that  the  scaling 
and  squaring  series  method  was  a  relatively  inefficient  technique  for 
computing  the  matrix  exponential.  The  module  INTEG  used  a  similar  series 
method  for  computing  the  Riccati  integral,  so  it  seemed  worthwhile  to 
investigate  other  techniques  for  computing  this  operation,  also.  Moler 
and  Van  Loan  reference  a  paper  (Levis,  1969)  which  analyzes  the  compu¬ 
tation  of  this  integral  by  the  method  of  Simpson's  rule. 

Simpson's  rule  is  an  attractive  numerical  method  when  the  speed 
of  computation  is  important,  because  the  rule  exhibits  an  unusual  combin¬ 
ation  of  both  simplicity  and  a  high  degree  of  accuracy.  Simpson's  rule 
approximates  an  integral  by  sections  of  a  parabolic  curve,  rather  than 
by  the  cruder  rectangle  methods.  Yet,  the  simplest  formulation  of  the 
rule  consists  of  a  sum  of  only  three  approximation  terms.  The  accuracy 

of  the  rule  is  also  very  good.  The  computation  error  is  a  function  of 
5 

h  ,  where  h  is  the  interval  between  approximation  points.  To  obtain  a 
higher  degree  of  approximation,  it  would  be  necessary  to  use  a  more 
cumbersome  Newton-Cotes  formula  with  at  least  5  approximation  terms 
(Young  and  Gregory,  1972:369). 

Simpson' s  rule  approximates  the  Riccati  integral 

/n  _At  -A't 

e  Qe  dt  where  A'  =  A  transpose  (14) 

tn-l  At  =  t  -  t  , 

n  n- 1 
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with  the  formula 

V  =  £  [F(0)QF'(0  )  +  4F(h)QF'(h)  +  F (2h)QF* (2h)] 

where  h  =  At/2,  F(t)  =  e"At 
so  we  have 

V  =  »L  [e-A*°Qe-A,,°  +  4  e^V^  +  (2h)] 

3 

or 

v  =  _L  [ IQI  +  4e"At/2Qe"A't/2  +  e'AtQe"At‘] 

2  •  3 

where  I  =  the  identity  matrix 


(15) 


(16) 

(17) 


Finally, 

V  =  |  [Q  +  4e‘At^2Qe“A't/2  +  e_AtQe"At]  (18) 

Calculation  of  the  approximation  formula  for  the  Riccati  integral 

will  require  four  matrix  multiplications  and  four  exponential  calculations. 

In  what  ways  can  this  situation  be  improved?  The  number  of  exponential 

computations  can  be  cut  in  half  by  using  the  well-known  expression 

e”  =  (e  )'.  This  equality  is  a  consequence  of  the  Taylor  series 

expansion  of  the  matrix  exponential. 

The  necessary  calculations  can  be  further  simplfied,  however, 

by  considering  both  terms  in  the  Riccati  equation  (1).  Even  requiring 

only  two  exponential  computations,  the  Simpson's  rule  method  of  computing 

the  Riccati  integral  is  only  comparable  in  effort  to  the  original  Taylor 

series  method  which  summed  the  first  12  terms  in  the  series  expansion. 

Consideration  of  the  entire  Riccati  equation  reveals  that  a  total  of 

-At/2  -At 

three  different  exponential  terms  must  be  evaluated:  e  ,  e  , 
and  eAt.  The  use  of  a  mathematical  formula  which  relates  these  terms 
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could  lead  to  increased  efficiency  in  the  computation  of  Simpson's  rule 
and  of  the  entire  Riccati  equation. 

Consider  the  scalar  exponential  relation  es+^  =  ese*\  Does  this 
relation  also  hold  true  for  the  matrix  exponential?  In  fact,  it  does. 

A  proof  was  found  (Bellman,  1970:170)  which  is  displayed  in  Figure  14. 

It  is  then  possible  to  calculate  the  entire  Riccati  equation  by  making 
only  one  direct  exponential  evaluation,  that  of  the  term  e"^2.  Using 
the  matrix  splitting  formula  es+t  =  eset  yields  the  relation 

e-At  .  (e-At/2)(e-At/2,  .  (e-At/2)2>  (19) 

then  e ^  can  be  obtained  by  computing  the  direct  matrix  inverse  eAt  =  (e-At) 
This  strategy  combines  the  qualities  of  easy  implementation  and  efficient 
execution. 

IMPLEMENTATION 

The  implementation  strategy  for  incorporating  INTEG2  into  the 
HEATEST  program  was  quite  different  from  the  case  of  MEXP2.  MEXP2  repre¬ 
sented  a  complicated  arithmetic  process  which  was  performed  on  one  piece 
of  input  data,  i.e.,  the  A  matrix.  A  small  test  program  was  used  to 
prove  the  calculation  method  for  a  particular  sample  input  matrix,  and 
the  results  could  be  readily  verified. 

INTEG2  represented  a  different  situation.  The  calculation  of 
Simpson's  rule  was  relatively  simple,  but  it  was  technically  difficult 
to  cross-check  the  result.  The  complicated  structure  of  HEATEST  made 
it  difficult  to  extract  matching  A  and  Q  matrices  for  a  sample  input. 
Therefore,  the  following  implementation  strategy  was  used:  write  a 
program  module  for  INTEG2,  substitute  the  module  into  HEATEST,  and  compare 
the  resulting  parameter  estimations  and  error  covariance  to  those  of 
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the  previous  Taylor  series  method. 

The  programming  exercise  actually  consisted  of  writing  two  main 

modules:  INTEG2  and  TPS0SP3.  Originally,  TPS0SP2  constructed  the  entire 

Riccati  equation  by  consecutively  executing  MEXP  and  INTEG  to  compute 

the  first  and  second  terms  of  Equation  (1),  respectively.  The  revised 

structure  has  the  form  of  embedded  shells.  MEXP2  is  called  by  INTEG2  to 

-At/2 

calculate  the  matrix  exponential  e  .  INTEG2  uses  this  value  to 
compute  Simpson's  rule  in  evaluating  the  integral  function.  INTEG2 
is  called  in  turn  by  TPS0SP3  to  obtain  the  integral  value  and,  also, 
the  value  of  e"^*.  TPS0SP3  then  uses  a  matrix  inverse  module  to  obtain 
e^\  generates  the  first  term  of  Equation  (1)  from  eAt,  generates  the 
second  term  of  Equation  (1)  from  the  integral  value,  and  adds  the  terms 
to  obtain  the  final  result. 

Development  of  the  modules  INTEG2  and  TPS0SP3  was  divided  into 
two  stages.  The  first  and  most  important  stage  was  the  development  of 
software  manipulation  tools,  such  as  a  matrix  inversion  module.  The 
second  stage  was  the  combination  of  these  manipulation  tools  to  construct 
the  actual  computation  formula. 

MEXP2  satisfied  the  first  software  tool  requirement,  which  was 
to  compute  the  matrix  exponential  function.  The  second  tool  requirement 
was  for  a  module  to  perform  the  matrix  inverse  computation,  which  is 
a  fairly  simple  process  in  theoretical  terms.  However,  it  is  much  more 
of  a  challenge  to  program  this  function  in  an  efficient  manner  which 
does  not  magnify  floating-point  round-off  errors.  In  this  case,  the 
Unpack  User's  Guide,  1979  provided  efficient  and  reliable  code  for 

_ _  the  matrix  inverse  function.  The  negative  aspect  of  this  tool  was 

that  300  lines  of  code  were  added  to  HEATEST  in  order  to  perform  a 


29 


single  operation.  Evidently,  such  is  the  price  of  performance. 

The  Unpack  Code  was  given  a  dry  run  by  feeding  it  the  orthogonal 
transformation  matrix  U  which  is  displayed  in  Figure  10.  In  accordance 
with  theory,  the  resulting  inverse  matrix  was  exactly  equal  to  the 
transpose  of  U.  This  test  generated  a  high  level  of  confidence  in  the 
effectiveness  of  Unpack  module  SGEFA  and  Eispack  module  HQR,  and  also 
in  the  ability  of  the  programmer  to  successfully  implement  these  codes. 

The  third  software  tool  requirement  was  for  a  module  to  perform 
the  triple  matrix  multiplication  EQE^,  where  Q  is  presumed  to  be  a 
symmetric  matrix.  HEATEST  already  contained  a  module  named  MAT4  which 
seemed  to  perform  exactly  this  operation,  so  an  attempt  was  made  to 
interface  with  MAT4.  MAT4  was  also  very  attractive  because  it  apparently 
worked  with  an  exceptionally  small  number  of  intermediate  storage  arrays. 
Unfortunately,  the  construction  of  MAT4  was  nearly  impossible  to  decipher, 
because  it  used  one-dimensional  arrays  to  represent  two-dimensional 
matrices  in  a  rather  confusing  manner. 

An  attempt  was  made  to  validate  MAT4  by  incorporating  it  into  the 
MEXP2  test  program.  As  mentioned  in  a  previous  section,  MEXP2  calculated 
the  Schur  transformation  UTU^.  MEXP2  was  reprogrammed  several  different 
ways  over  a  period  of  two  weeks,  without  achieving  any  '•uccess  with  the 
use  of  MAT4.  Finally,  it  became  necessary  to  trace  tediously  by  hand 
the  operation  of  MAT4  on  some  sample  matrices  of  order  4-by-4.  The 
results  were  illuminating. 

MAT4  used  minimal  storage  space  because  it  only  calculated  the 
upper  triangular  half  of  the  resultant  matrix.  When  returning  the 
final  values,  MAT4  simply  assumed  that  the  resulting  matrix  was  symmetric. 
MAT4  had  been  ambiguously  documented  in  a  way  that  could  be  taken  to 
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mean  that  the  central  input  matrix,  rather  than  the  output  matrix,  was 
presumed  to  be  symmetric.  Therefore,  it  was  determined  that  MAT4  was 
useless  for  purposes  of  INTEG2,  and  a  new  module  named  TRI  was  written 
to  satisfy  this  final  software  tool  requirement,  TRI  was  then  incorporated 
into  the  MEXP2  test  program  as  a  cross-check,  with  perfect  results. 

At  this  point,  all  the  necessary  software  tools  were  available 
and  proven.  The  modules  INTEG2  and  TPS0SP3  were  then  written  to  con¬ 
struct  the  Riccati  equation  from  the  software  tools,  and  these  modules 
were  then  substituted  into  the  HEATEST  program.  This  second  stage  was 
performed  with  a  certain  amount  of  trepidation  about  the  computation 
results  that  would  be  produced.  The  software  tools  had  all  been  meticulously 
proven,  but  the  final  evaluation  depended  on  combining  a  large  number 
of  theoretical  manipulations  into  a  couple  of  program  module  quantum 
jumps.  It  seemed  unlikely  that  so  much  theory  could  be  included  in  the 
module  TPS0SP3  without  an  inevitable  flaw,  and  it  seemed  unlikely  that 
such  a  flaw  could  be  easily  isolated. 

When  these  modules  were  eventually  incorporated  into  HEATEST, 
it  so  happened  that  one  dummy  parameter  was  inadvertently  omitted  from 
a  subroutine  call.  This  single  mistake  caused  several  discouraging 
interface  complications.  It  was  necessary  to  expend  considerable 
effort  to  merely  discover  the  existence  of  a  logical  error.  Finally, 
a  simple  debugging  trace  was  used  to  locate  the  problem  module  and  to 
correct  the  missing  input  parameter. 

RESULTS 

I NTEG2  was  successfully  incorporated  into  the  HEATEST  program, 
and  execution  times  were  measured  for  INTEG  versus  INTEG2  over  a  range 
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of  matrix  dimensions.  The  results,  displayed  in  Table  2  and  Figure  15, 
were  as  follows: 


V 


A.  The  time  curves  in  Figure  15  show  a  significant  improvement 
in  execution  speed  when  using  Simpson's  rule.  It  should  be  mentioned 
that  the  curve  for  INTEG2  represents  only  the  time  difference  between 
TPS0SP2  and  TPS0SP3  that  resulted  from  the  use  of  Simpson's  rule,  and 
does  not  reflect  the  simultaneous  benefit  of  the  improved  calculation  of 
the  matrix  exponential. 

B.  The  desired  aerothermodynamic  parameters  appear  to  converge 
to  the  same  values  whether  using  INTEG  or  INTEG2.  However,  the  use  of 
INTEG2  seems  to  require  a  couple  of  extra  iterations  of  NEATEST  to  obtain 
the  same  amount  of  convergence  as  before.  This  is  consistent  with  a 
simultaneous  change  from  2.45  to  3.28  in  the  accumulated  covariance  of 
error  between  predicted  and  measured  temperature  that  also  occurred. 

Some  of  this  difference  can  be  attributed  to  a  magnification  of  roundoff 
error  that  occurred  during  the  inversion  of  the  matrix  exponential  e"At, 
which  probably  destroyed  about  four  decimal  places  of  accuracy.  The 
rest  of  the  error  must  be  taken  to  reflect  the  inherent  limitations  of 
Simpson's  rule.  However,  the  HEATEST  program  runs  dramatically  faster 
with  module  INTEG2,  even  when  extra  iterations  are  taken  into  account. 


V.  ANALYSIS  AND  CONCLUSIONS 


NORMS,  CONDITION  NUMBER  AND  ACCURACY 


The  scaling  and  squaring  series  method,  used  in  both  modules 

MEXP  and  INTEG,  is  formulated  on  the  concept  of  a  matrix  norm.  A  matrix 

norm  can  be  defined  in  several  different  ways  for  theoretical  purposes, 

but  the  general  idea  is  to  provide  a  comparative  measure  of  the  size 

of  a  matrix  in  terms  of  the  magnitude  of  its  eigenvalues,  in  section 

two  of  this  paper,  it  was  noted  that  the  norm  of  matrix  A  must  have  a 

value  less  than  1  in  order  to  prevent  the  growth  of  intermediate  series 

elements.  The  scaling  and  squaring  method  divides  A  by  a  factor  2 

such  that  jJ^j/Z^sl .  The  module  MEXP  computes  the  matrix  exponential 

e^ty^2  ,'and  the  result  is  squared  k  times.  Since  the  largest  eigenvalue 

in  a  sample  matrix  A  was  found  to  be  on  the  order  of  28,000,  it  was 

expected  that  the  scaling  factor  k  would  probably  be  tremendous  because 

the  norm  would  be  very  large.  However,  it  was  found  that  the  module 

TPS0SP2  represents  time  in  hour  units,  even  though  the  actual  temperature 

samples  were  taken  every  second  in  real  time.  Therefore,  the  value  of 

At 

t  is  always  less  than  1/3600,  so  the  scaling  effect  on  e  was  much 
less  than  expected.  The  module  XN0RM  was  used  by  MEXP  and  INTEG  to 
estimate  the  matrix  norm.  The  value  of  XN0RM  was  examined  during 
executions  of  HEATEST  over  a  range  of  matrix  dimensions,  and  the  estimated 
norm  was  always  in  the  range  75,000-77,000  with  a  scaling  factor  of 
k  =  5.  The  norm  was  about  half  this  large  on  all  succeeding  iterations 
of  HEATEST. 
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Every  matrix  can  be  assigned  a  condition  number  which  indicates 
the  amplification  of  roundoff  error  that  can  occur  when  solving  a 
linear  system,  such  as  when  calculating  the  matrix  inverse.  A  large 
condition  number  indicates  that  the  linear  system  is  very  close  to 
being  linearly  dependent,  so  the  matrix  is  nearly  singular.  A  similar 
problem  can  arise  in  the  solution  of  matrix  differential  equations 
where  the  eigenvalue  ratio  Jemax/emin|  very  large,  in  which  case  the 
matrix  is  said  to  be  "stiff". 

The  A  matrices  of  HEATEST  exhibit  some  stiffness  because  the 
nodes  of  the  temperature  grid  are  necessarily  closer  together  near  the 
top  and  bottom  surfaces  of  the  Shuttle  insulation.  This  is  done  in 
order  to  accurately  represent  the  interfaces  between  the  layers  of  the 
insulation,  but  the  result  is  a  matrix  with  a  rather  poor  condition  for 
computations.  For  example,  the  matrix  in  Figure  11  can  be  seen  to  have 
diagonal  eigenvalues  that  range  from  0.9  to  28,000.  (For  a  discussion 
of  condition  numbers,  refer  to  Young  and  Gregory,  1973,  Vol .  II:  564, 
811,943) 

The  condition  number  of  the  HEATEST  matrices  was  determined  by 

calculating  e  /e  for  those  eigenvalues  on  the  main  diagonal  of  all 
3  max  min  3  3 

triangular  matrices  in  module  MEXP2.  For  matrices  of  order  13  and  30, 

the  condition  number  during  the  initial  time  propagation  of  HEATEST  was 

found  to  be  in  the  range  75,000  to  77,000,  similar  to  the  value  of  the 

estimated  norm.  This  would  indicate  an  expected  loss  of  4  to  5  digits 

of  precision  when  calculating  a  matrix  inversion,  such  as  the  case  of 

At  -At 

TPS0SP3  in  obtaining  e  from  e 

Matrix  transformations  also  provide  an  opportunity  for  the 
condition  number  to  cause  error.  This  particularly  applies  to  the 
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matrix  S-1  which  is  used  in  the  diagonal  transformation  A  =  SDS-1.  One 

advantage  of  using  the  Schur  triangular  transformation  A  =  UTU"*  is  that 

all  orthogonal  matrices  U  have  a  condition  number  of  1,  so  no  additional 

error  is  created  during  matrix  multiplication  or  inversion. 

The  accuracy  of  the  module  HQR  (that  calculates  the  matrix  U) 

is  determined  by  an  internal  parameter  named  MACHEP  which  specifies 

the  precision  goal  of  the  QR  algorithm.  The  value  of  MACHEP  was  varied 

-5  -13 

over  the  range  from  10  to  10  in  order  to  determine  the  effect  on 

HEATEST.  The  results  showed  that  the  QR  algorithm  must  converge  very 

quickly,  because  the  total  execution  time  of  HEATEST  only  increased 

from  178  seconds  to  179  seconds  as  the  precision  value  of  MACHEP  was 

increased.  Therefore,  all  further  use  of  HQR  was  done  with  the  value 

-13 

of  MACHEP  set  to  10  ,  which  is  the  single  precision  limit  of  the 

CYBER  computer.  The  values  of  eAt  were  then  compared  for  matrices  of 
order  8  after  computation  by  MEXP  and  by  MEXP2.  The  results  of  the 
two  methods  showed  an  agreement  of  10  to  13  significant  digits.  It 
is  difficult  to  judge  which  method  is  the  more  precise. 


The  computation  of  the  Riccati  integral 


(19) 


by  Simpson's  rule  has  a  theoretical  error  bound  of 

E  =  -  -4  [e"AtQe’A  t] 

n  180  r  dT 


where  n/2  =  1  is  the  number  of  subintervals  in  the  approximation, 

according  to  Levin,  1969:410.  Since  time  is  represented  in  hour  units 

and  it _  =  1/3600,  it  would  seem  that  the  expected  error  would  be 

max  r 


close  to  the  limit  of  computer  precision.  However,  the  results  of  modules 
INTEG  and  INTEG2  were  compared  for  matrices  of  order  8,  and  the  answers 
showed  agreement  only  of  5  to  9  significant  digits  in  the  mantissa 
(right  of  the  decimal  point). 

Evidently,  the  accuracy  of  module  INTEG2  is  inferior  to  that  of 
INTEG,  because  the  covariance  of  the  a  priori  temperature  propagation 
was  increased  slightly,  from  2.45  to  3.28,  for  matrices  of  order  13. 

This  seems  to  explain  the  observation  that  the  aerothermodynamic  para¬ 
meter  values  appear  to  converge  somewhat  slower  per  iteration  of  HEATEST 
when  using  INTEG2.  However,  the  parameters  seem  to  converge  to  the  same 
values  whether  using  INTEG  or  INTEG2  to  obtain  the  result.  It  can  be 
seen  in  Figure  16  that  INTEG2  is  much  more  efficient  in  execution  time. 

EXECUTION  PERFORMANCE 

Analysis  of  the  loop  structure  of  module  MEXP  shows  that  the  number 
of  floating  point  operations  required  for  each  execution  is 

2n4  +  (k  -  3)n3  +  (3/2)n2  +  79n  +  2k  +  3 

where  n  is  the  matrix  order 

k  is  the  average  scaling  and  squaring  factor 
According  to  Parlett,  1974:4,  the  operation  count  of  module 

o  o 

MEXP2  is  only  lOn  to  15n  for  the  Schur  decomposition  plus  (l/3)n 
to  build  up  the  triangular  function  matrix. 

Analysis  of  the  loop  structure  of  module  INTEG  shows  that  the 
floating  point  operation  count  for  one  execution  is 

(25  +  (5/2)k))n3  +  (35/2)n2  +  16n  +  45  +  2k 

where  n  is  the  matrix  order 
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k  =  5  is  the  average  scaling  and  squaring  factor 

3  2 

Module  INTEG2  has  an  operation  count  of  only  6n  +  2n  ,  which  con 

sists  mainly  of  5  matrix  multiplications  and  1  matrix  inversion. 

The  overall  effect  of  these  improvements  on  the  HEATEST  program 

is  displayed  dramatically  in  Figure  16.  This  graph  shows  that  the 

large  coefficients  in  the  operation  count  of  INTEG  have  a  greater  effect 

4 

on  matrices  of  small  order  than  the  n  term  in  the  operation  count  of 
module  MEXP. 

SOFTWARE  DESIGN  CRITIQUE  OF  THE  HEATEST  PROGRAM 

The  most  favorable  design  feature  of  the  HEATEST  program  is  the 
manner  in  which  it  is  organized  into  submodules,  each  of  which  performs 
a  well  defined  function.  This  feature  was  conducive  to  further  software 
development  because  of  the  ease  with  which  alternative  methods  could  be 
substituted  into  the  program. 

The  negative  software  features  of  HEATEST  consisted  of  inadequate 
documentation,  lack  of  source  references  for  utility  modules,  and  a  lack 
of  top-down  logical  program  structure  within  the  submodules.  These 
features  prompted  some  classic  software  development  problems  during  the 
course  of  this  project.  These  situations  are  listed  as  follows: 

1.  During  the  investigation  of  modules  MEXP  and  INTEG  it  could 
have  been  instructive  to  print  out  the  values  of  the  intermediate  series 
terms.  This  would  have  demonstrated  the  extent  of  the  growth  of  inter¬ 
mediate  series  elements,  and  it  might  have  shown  whether  the  series 
could  be  truncated  after  a  smaller  number  of  terms  were  computed. 
Unfortunately,  this  investigation  was  precluded  because  the  internal 
coding  of  these  modules  was  almost  indecipherable.  MEXP,  INTEG,  and 
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several  other  software  modules  had  been  procured  from  a  commercial 
source,  and  the  only  available  documentation  was  limited  to  a  descrip¬ 
tion  of  inputs,  outputs,  and  the  general  computational  strategy. 

This  black  box  program  construction  was  understandable  since 
HEATEST  was  originally  written  as  a  prototype  in  a  research  environment, 
where  internal  documentation  was  a  secondary  consideration.  However,  this 
feature  proved  to  be  an  obstruction  to  the  research  effort  of  this 
thesis.  The  replacement  modules  MEXP2  and  INTEG2  should  be  a  software 
engineering  improvement  to  the  HEATEST  program  because  these  modules 
have  a  simple  sequential  internal  structure,  all  intermediate  steps 
have  descriptive  comments,  and  source  references  are  listed  in  each 
major  submodule. 

2.  Documentation  is  almost  non-existent  throughout  the  body  of 
the  HEATEST  program  code,  and  some  of  the  few  existing  comments  are 
misleading.  As  a  consequence,  this  author  experienced  two  weeks  of 
delay  in  a  fruitless  effort  to  interface  INTEG2  with  module  MAT4. 

Now  that  HEATEST  is  becoming  established  as  a  useful  tool,  it  should 
be  upgraded  with  descriptive  comments  at  the  earliest  opportunity. 

This  will  preclude  the  inevitable  problems  from  insufficient  program 
clarity. 

3.  A  final  example  illustrates  the  inadequate  documentation  of 
the  HEATEST  program.  After  the  development  of  module  MEXP2,  it  was 
necessary  to  analyze  the  structure  of  module  TPS0SP2  in  order  to 
properly  interface  them.  Although  J.K.  Hodge  is  the  person  most 
familiar  with  the  programming  details  of  HEATEST,  even  he  found  it 
necessary  to  consult  Audley,  1982  in  order  to  decipher  some  of  the 
mathematical  functions  of  module  TPS0SP2.  At  least  these  particular 


mathematical  functions  are  now  well  documented  in  the  replacement  module 
TPS0SP3. 

SOFTWARE  DEVELOPMENT  PROBLEMS 

As  anticipated,  there  were  two  types  of  development  problems  that 
hindered  the  pace  of  this  project.  These  were: 

1.  Programming  language  difficulties 

2.  Use  of  the  AFIT  Cyber  computer  resource 
These  problems  are  discussed  in  detail  below. 

It  was  anticipated  that  the  Fortran  77  standard  (Fortran  5) 
might  prove  to  be  a  difficulty,  since  this  user  had  last  programmed 
in  Fortran  about  12  years  prior  to  the  start  of  this  project.  As  it 
turned  out,  the  major  language  problems  were  caused  by  the  deficiencies 
of  Fortran  in  representing  variable  size  storage  arrays.  This  feature 
caused  several  conflicts  when  the  first  attempts  were  made  to  interface 
the  replacement  modules  into  the  HEATEST  program.  In  order  to  overcome 
this  difficulty,  it  was  necessary  to  transmit  the  storage  arrays  through¬ 
out  the  entire  HEATEST  program  by  means  of  either  COMMON  statements  or 
as  dummy  subroutine  parameters.  The  danger  of  this  strategy  is  that 
later  changes  to  a  local  module  can  cause  global  errors  to  propagate 
throughout  the  entire  program. 

Another  type  of  language  problem  was  caused  by  the  practical  need 
to  use  HEATEST  to  process  temperature  node  sets  of  various  dimension. 

The  HEATEST  program  is  actually  composed  of  several  modules  that  are 
stored  in  special  form  in  a  Cyber  UPDATE  library.  At  the  time  of  pro¬ 
gram  compilation,  there  are  various  correction  sets  that  must  be  used 
to  provide  COMMON  statements  that  define  the  appropriate  dimensions  of 


the  various  storage  arrays.  This  is  a  very  flexible  arrangement  for 
the  dedicated  user  who  must  process  assorted  data  sets.  However,  the 
complexity  of  the  correction  sets  and  the  unfamiliar  UPDATE  control 
language  were  impediments  to  this  user  during  the  software  development 
process. 

The  characteristics  of  the  AFIT  Cyber  computer  environment  caused 
continual  frustration  and  difficulty  throughout  the  entire  course  of 
this  project.  It  will  suffice  merely  to  list  the  following  problems 
that  resulted  from  the  administration  of  Cyber  accounts  and  the  idio- 
syncracies  of  the  Cyber  computer  operating  system: 

1.  Accidental  revocation  of  the  computer  password  for  this 
project. 

2.  Periodic  erasure  of  several  working  files,  and  the  mysterious 
purging  of  one  complete  UPDATE  library. 

3.  Non-availability  of  computer  terminals,  even  late  at  night 
and  on  weekends. 

4.  Program  turnaround  times  of  24  to  48  hours. 

5.  Occasional  shutdov/ns  of  the  Cyber  computer  system  for  several 
days,  due  to  maintenance  problems  and  system  reorganization. 

6.  Unfriendly  characteristics  of  the  Cyber  interactive  operating 
system  and  its  primitive  line-oriented  editor  utility. 

FURTHER  AREAS  FOR  INVESTIGATION 

Due  to  time  limitations,  there  are  several  aspects  of  this 
project  that  could  not  be  sufficiently  examine.  These  topics  are 
described  here  in  detail. 

1.  Examination  of  the  intermediate  terms  in  the  scaling  and 


squaring  Taylor  series.  These  terms  could  be  checked  to  discover  the 
existence  of  any  intermediate  element  growth.  It  is  also  possible 
that  the  series  could  be  further  truncated,  especially  module  MEXP 
which  sums  (25  +  n)  terms  where  n  is  the  order  of  the  matrix  A.  This 
could  not  be  examined  because  modules  MEXP  and  INTEG  were  indecipherable 
and  would  probably  need  to  be  reconstructed. 

2.  Moler  and  Van  Loan,  1978:828  state  that  the  scaling  and 
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squaring  method  can  be  implemented  with  the  order  of  n  floating  point 
operations.  Loop  analysis  of  module  MEXP  showed  0 ( n  )  operations,  so 
this  could  be  an  inefficient  implementation.  An  attempt  could  be  made 
to  rewite  MEXP  with  0(n  )  operations,  and  to  examine  the  efficiency  of 
this  implementation  in  comparison  to  MEXP2,  which  used  the  Schur  decompo¬ 
sition. 

3.  Use  the  algorithm  that  was  developed  in  Section  3  of  this  paper 
for  matrix  multiplication  of  triangular  matrices.  This  algorithm  could  be 
used  in  place  of  Parlett's  method  to  make  a  series  computation  of  the 
matrix  function,  after  performing  the  initial  Schur  triangular  decompo¬ 
sition.  This  method  would  probably  be  inferior  to  Parlett's  method, 
however. 

4.  Investigate  the  efficiency  of  the  diagonal  transformation 

A  =  SDS”*.  This  matrix  decomposition  is  now  feasible  because  analysis 
of  the  nature  of  the  A  matrix,  from  section  3  of  this  paper,  proved  that 
a  distinct  set  of  eigenvalues  will  always  exist.  The  speed  of  this 
method  would  depend  on  the  efficiency  of  the  algorithm  which  constructs 
the  transformation  matrix  S.  This  method  would  probably  be  inferior 
to  the  Schur  decomposition  in  accuracy,  however,  due  to  the  condition 
number  of  the  A  matrices.  There  would  be  a  probable  loss  of  4  to  5 


significant  digits  when  performing  matrix  inversion  to  obtain  the 

transformation  matrix  S"^ .  A  further  loss  of  accuracy  would  occur  when 

the  matrix  e"At  is  inverted  in  module  TPS0SP3  to  obtain  the  value  of 
At 

e  for  calculating  the  first  term  of  the  Riccati  equation. 

5.  Further  analytical  and  computational  effort  could  be  made  to 
investigate  the  accuracy  of  the  results  of  Simpson's  rule  from  module 
INTEG2.  It  is  not  exactly  clear  what  effect  this  accuracy  has  on  the 
covariance  of  temperature  propagation  in  the  HEATEST  program.  The 
covariance  appears  to  be  somewhat  greater,  but  the  extent  of  the  increase 
is  not  well  known.  The  effect  of  this  Covariance  increase  on  the  con¬ 
vergence  of  the  aerothermodynamic  parameter  values  is  also  in  question. 


CONCLUSIONS 

Matrix  decomposition  methods  can  be  much  more  efficient  than  the 


familiar  Taylor  series  for  computing  the  matrix  exponential  function, 
especially  as  the  order  of  the  matrices  is  increased.  Many  methods 
exist  for  computing  the  matrix  exponential,  but  only  a  few  are  competitive 
in  both  speed  and  accuracy.  The  scaling  and  squaring  series  method  is 
both  simple  and  accurate,  but  the  Schur  decomposition  with  Parlett's  method 
for  evaluating  matrix  functions  seems  to  have  comparable  accuracy  and 
much  greater  computational  efficiency. 


The  scaling  and  squaring  series  method  can  also  accurately  compute 


the  Riccati  integral 


-At  -A't 

e  Qe  dt  but  computational  efficiency 


of  this  method  is  greatly  degraded  when  the  A  matrix  has  a  large  compu¬ 
tational  condition  number.  In  this  situation,  a  numerical  integration 
method  such  as  Simpson's  rule  can  provide  a  dramatic  increase  in  the 


speed  of  execution  at  the  cost  of  some  loss  of  accuracy. 

The  matrix  Riccati  equation  can  be  reliably  computed  for  the 
HEATEST  program  with  a  considerable  reduction  in  execution  time  by 
means  of  the  alternative  numerical  methods  discussed  above.  The  covariance 
of  temperature  propagation  is  somewhat  increased,  but  the  slower  iterative 
convergence  that  results  is  clearly  outweighed  by  the  increased  efficiency 
of  overall  program  execution. 

The  results  of  this  thesis  have  been  of  immediate  benefit  to  the 
U.S.  Air  Force.  Within  one  week  after  these  improved  methods  were  per¬ 
fected,  the  program  code  was  transmitted  to  the  Air  Force  Flight  Test 
Center  at  Edwards  AFB.  The  improved  computational  efficiency  of  the 
HEATEST  program  will  not  only  save  on  budget  money  for  computer  time, 
but  it  also  will  allow  the  use  of  larger  matrices  that  can  model  deeper 
heat  penetration  into  the  Shuttle  thermal  shield.  This  will  result  in 
a  better  definition  of  the  envelope  of  reentry  trajectories  for  the 
Orbiter  vehicle  and,  therefore,  an  increase  in  its  effective  mission 
capability. 

The  improved  HEATEST  program  is  already  in  extensive  use  by  the 
AFIT  Aerodynamics  Department,  where  program  turnaround  times  have  been 
reduced  from  one  week  to  less  than  24  hours.  As  a  result,  new  research 
is  being  done  about  possible  coupling  effects  of  the  aerothermodynamic 
parameters.  The  HEATEST  program  also  has  a  potential  for  application 
to  the  reduction  of  wind  tunnel  data  by  the  Air  Force  Flight  Dynamics 
Laboratory,  now  that  its  execution  cost  has  been  substantially  reduced. 


Figure  4 


1  1 
1  1 
1  1 
0  1 
0  1 


Tridiagonal  Matrices  Exhibit  Bandwidth 
Spread  Under  Matrix  Multiplication 
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Figure  5.  Triangular  Matrices  Retain  Their  Form 
Under  Matrix  Multiplication 


Figure  6.  Matrix  Multiplication  by  Elements 

Lower  Triangular  Elements  are  Underlined 
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A9B9 

Figure  7.  Remaining  Terms  When  Lower  Triangular 
Elements  are  all  Zero 
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Figure  10.  Orthogonal  Schur  Transformation  Matrix 


(CbpbAXl_3/2  +  CqPqAXq/2 )U^_2  =  Kg/AXL_3 

-  (KB/aXL-3  +  KC/aXC>UL-2  +  KC/AXC  UL-1 

(CcpcaXc/2  +  CqPjjAXjj/2  )U^_j  =  Kc/aXcUl_2 

-  <VAXc  +  kd/aXd)ul-i  +  kd/aXdul 
cdpdaXd/2°l  =  *VAXDUL-1  '  *Vaxdul 


Figure  12.  System  Equations  Which  Generate  Matrix  "A 


Comparison  Data 

> 

Matrix  Size 

Table  I 

for  Modules  MEXP  and  MEXP2 

Execution  Time  (Seconds) 

s 

MEXP 

MEXP2 

1 

5 

x  5 

.014 

.012 

8 

x  8 

.071 

.037 

5 

x  9 

.110 

— 

k  10 

x  10 

.163 

.064 

11 

x  11 

.232 

— 

i  12 

x  12 

.327 

— 

13 

x  13 

.435 

.119 

1  14 

x  14 

.595 

— 

i  15 

x  15 

.778 

.198 

•'  16 

x  16 

1.016 

— 

17 

x  17 

1.267 

— 

^  20 

x  20 

— 

.396 

25 

x  25 

— 

.718 

30 

x  30 

1.142 

eA(s  +  t)  =  eAseAt 


Proof: 


Using  the  series  expansions  for  the  three  exponentials  and  the 


fact  that  absolutely  convergent  series  may  be  rearranged  in  arbitrary 


fashion,  we  have 


As  At 
e  e 


'  y*»  a^s^  "  Amtm  " 

-  k=0  k!  J  Lm=0  m!  - 


oo 

£  An  [  £  sktm  ‘ 
n=0  L  k+m=n  k ! m ! 


-  E  * 


n  (s  +  t)n 


=  gA(s+t) 


Ref:  Bellman,  1970:  p.  170 


Figure  14.  Proof  of  Matrix  Exponential  Theorem 


56 


ft 


Table  II 

Comparison  Data  for  Modules  INTE6  and  INTEG2 


Matrix  Size 

Execution  Time 

(Seconds) 

INTEG 

INTEG2 

8x8 

.170 

.020 

10  x  10 

.314 

.033 

13  x  13 

.649 

.062 

15  x  15 

.981 

.088 

20  x  20 

2.233 

.175 

25  x  25 

4.251 

.311 

30  x  30 

7.272 

.509 
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Table  III 


«r 


Total  Execution  Time  of  the  Heatest  Program  Using 
Module  Substitutions 

Execution  Time  Includes  Initial  Time  Propagation 
Plus  One  Parameter  Iteration 


Matrix  Size 

Execution 

Time 

(Seconds) 

MEXP, 

MEXP2 

,  MEXP2, 

INTEG 

INTEG 

INTEG2 

8x8 

74.5 

68.6 

38 

10  x  10 

116.3 

109.2 

50 

13  x  13 

266.8 

198.3 

76 

15  x  15 

413.5 

290.6 

103 

20  x  20 

— — 

— 

190 

23  x  23 

— 

— 

252 

CVJ 

X 

CVJ 

_  -  - 

— 

287 

26  x  26 


352 


Figure  16.  Total  Execution  Time  of  HEATEST  Program 
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Figure  17.  Structure  Chart  of  Original  Program  Modules 


SUBROUTINE  TPSOSP2  (DT) 

DIMENSION  PHI(8,8),PC(8,R),QD(8,8),QDT(R,8) 
DIMENSION  QUE(8,8) ,  A( 8 , 8  ) 

C OM MON/MAIN/ NPTPC.NPTSS, PH  I, PC , QD , QDT , QUE , A 

IF ( DT  .  LE .  O.O)  GO  TO  999 

DELT-DT/3600. 

C 

C  COMPUTE  LINEARIZED  STATE  TRANSITION  MATRIX,  PHI(K,K+1) 

c 

NPT  S  S -N  PT  PC 

CALL  MEXP(NPTSS, A( 1 , 1 ) , D E LT , PHI (1 , 1 ) 

C 

COMPUTE  MODEL  NOISE  COVARIANCE,  QD(K+1) 

C 

CALL  EQUATE  (  NPTSS,  NPT  SS,QD(1  ,  1)  ,QUF.(1  ,  1  ) 

CALL  MSCALE(NPTSS,NPTSS,A(l,l),-I.O,A(l,l) 

CALL  INTEG(NPTSS,A(l,l),QD(l,l),QDT(l,l),DFLT) 

CALL  MAT4 (NPTSS, NPTSS, QDT( 1.1), PH  1(1, I) ,00(1,1)) 
CALL  MSCALE (NPTSS, NPTSS, A(1,1),-1.0,A(1, 1)) 

Cl 

C  COMPUTE  ONE-STEP  PREDICTED  APRIORI  COVARIANCE,  PC(K+1) 
0 

CALL  M AT4 (N PTSS, NPTSS, PC(1,1), PH  1(1,1), QDT (1,1)) 

DO  520  IPC-l, NPTSS 
DO  520  JPC-l, NPTSS 

520  PC(IPC,JPC)  -  QDT (I PC ,  JPC)+QO(IPC , JPC) 

NPT  S  S  =*I  NF 
999  RETURN 
END 


SUBROUTINE  MF.XP  (  N  ,  A ,  T  ,  F  A  ) 

DIMENSION  A(l),EA(l),C(30),n(3t),E(30) 
COMMON/MAINI/NDIM,X(l ) 

NDIM-NDIM+1 

NN-NDIM*N 

NMl-N-1 

I F ( N  .GT.  1)  GO  TO  5 
F.A(1)*EXP(T*A(1)) 

RETURN 
5  W-l .0 

DO  10  1*1  , NN , NDIM 

IL*I+NM1 
DO  10  J*I,IL 
10  EA(J)*A(J) 

C1«XN0RM(N, A) 

I N  D*0 
L*l 
T 1  *T 

15  IF( ABS(TI*C1) .  LE .  3.0)  GO  TO  20 
Tl-Tl/2. 

IND-IND+l 
GO  TO  15 
20  C 2  =0 . 

DO  25  1*1, NN , NDIM l 

25  C2=C2-EA(I) 

C2*C2/FLOAT(L) 

C(L)*C2 
D(L+l )-0. 

II-N+l-L 
E  (  1 1 )  «W 
II-l 

DO  35  1*1 ,NN,NDIM 

IL-I+NM1 
DO  30  J*I,IL 
30  X(J)*F.A{J) 

X ( 1 1 ) *X ( 1 1 )+C  2 
35  II-II+NDIMl 

I F ( L  .EQ.  N)-  CO  TO  40 
CALL  MMUL(X,A,N,N,N,F.A) 

W-V*Tl /FLOAT(L) 

L-L  +  l 
GO  TO  20 
40  CONTINUE 


CAN  CHECK  X : 0  FOR  ACCURACY 
J*N  +  2  5 

DO  50  L  =  N ,  J 
DO  45  K*1,N 

D( K ) =D(K+1)-W*C(K))*T1/ FLOAT (L) 
45  E(K)=F,(K)  +  D(K) 

50  U*D(1) 

11*1 

DO  60  1*1 , NN , NDIM 

IL-l+NMl 
DO  55  J *1 , IL 
55  E A( J ) =E ( 1 ) *A( J  ) 

EA( 1 1 )  »E  A(  1 1  )+E  (  2  ) 

60  11  =  1 1  +  N  D 1M 1 

I F  (  N  .EQ.  2)  CO  TO  R5 

DO  80  L  =  3 ,  N 

CALL  MMUL(EA,A,N,N, N ,  X  ) 

11  =  1 
C2=E(L) 

DO  75  1=1, NN, NDIM 

IL-l+NMl 
DO  70  J-l.IL 
70  E A( J ) =X ( J ) 

EA(II)=EA(II)+C2 
75  1 1  =  1 1+NDIMl 
80  CONTINUE 

85  IF(  I ND  .F.Q.  0)  RETURN 
DO  100  L-l.IND 
DO  00  1=1, NN, NDIM 

IL-l+NMl 
DO  00  J  »I , I L 
00  X(J)=EA(J) 

100  CALL  MMUL(X,X,N,N,N,EA) 

RETURN 


SUBROUTINE  INTEG(N, A.C.S ,T) 
S-INTECRAL  E A*C*EA '  FROM  0  TO  T 
C  IS  DESTROYED 

DIMENSION  A(l),C(l) , S ( 1 ) , COE F ( I  5 ) 
rOMM ON/M AI N1 /NDIM , X ( 1 ) 

NDIMl-NDIM  +  l 
NN-N*NDIM 
NM1-N-1 
NT-l  3 
NTMl-NT-1 

I  ND-0 

ANORM -XNORM ( N  ,  A ) 

DT-T 

5  IF( ANORM*ARS(DT)  .LE.  0.5)  GO  TO  10 
DT-DT/2 . 

IND-IND+1 
GO  TO  5 

10  DO  15  1-1 , NN , NDIM 

J-I+NM1 
DO  15  JJ-I.J 
15  S(JJ)-DT*C( JJ) 

Tl-DT**2/2  . 

DO  25  IT-3,17 

CALL  MMUL(A,C,N,N,N,X) 

DO  20  I -1 , N 
II-( 1-1 )*NDIM 
DO  20  J  J-I , NN , N  DIM 
II-II+I 

C(JJ)-(X(JJ)+X(II))*Tl 
-tO  S(  JJ)-S(  JJ)+C(  JJ) 

25  T1-DT/FL0AT(IT) 

IF ( I  NO  .EO.  0)  GO  TO  100 
COEF ( NT ) -1 .0 
DO  30  I-l.NTMl 
1 1 -NT- 1 

JO  COF. F(TI)-DT*C0EF(II  +  1  )/FLOAT(I) 

DO  40  1-1  ,  NN  ,  N  D  IM 

I I  - 1 
J-I+NMl 

DO  35  JJ«I,J 
35  X(JJ)»A(JJ)*COEF(I  ) 


X(II)«X(II)+C0EF(2) 

40  II-II+NDIM1 
DO  55  L*3 , NT 
CALL  MMUL(A,X,N,N,N,C) 
II-l 

Tl-COEF(L) 

DO  55  I  ■! , NN ,  NDIM 

J-I+NM1 

DO  50 

50  X(JJ)*C(JJ) 

X ( 1 1 ) *X (II )+T 1 
55  II-II+NDIM1 
X=*EXP(A*DT) 

L-0 

60  L-L+l 

CALL  MMUL(X,S,N,N,M,C) 
II-l 

DO  90  I-l.N 
J-II 

I F ( I  .EQ.  1)  GO  TO  75 
DO  70  JJ*I , II, NDIM 
S(JJ)-S(J) 

70  J-J+l 

75  DO  85  JJ-l.N 
KK-JJ 

DO  80  K-1,NN,NDIM 
S(J)=S(J)+C(K)*X(KK) 

80  KK-KK+NDIM 
85  /-J+NOIM 

00  87  JJ-I,NN,NDIM 
87  C(JJ)-X(JJ) 

90  1 1 ■ 1 1 +N DIM 

IF  ( L  .EQ.  IND)  GO  TO  100 
CALL  MHUL(C,C,N,N,N,X) 

GO  TO  60 

100  CONTINUE 
RETURN 
END 


SUBROUTINE  MMUL ( X , Y , N 1 , N2 , N 3 , Z ) 

DIMENSION  X( l  )  ,  Y( 1 ) , Z( l ) 

COMMON/MAIN1/NDTM 

NEND3=NDIM*N3 

NEND2=NDIM*N2 

DO  1  1=1, NI 

DO  1  J “I , NEND3 , NDIM 

TM=  0. 

K  =  I 

KK-J-1 
5  KK-KK+1 

TM-TM+X(K)*Y(KK) 

K-K+NDIM 

I F ( K  .LE.  NEND2)  00  TO  5 
1  2(J)-TM 
RETURN 
END 


FUNCTION  XNORM ( M , A) 

COMPUTES  AN  APPROXIMATION  TO  NORM  OF  A.  NOT  A  BOUND. 
DIMENSION  A (  1 ) 

C0MM0N/MAIN1 /NDIM 
NOIMl-NDIM+l 
NN  =  N*N  DIM 
Cl-O. 

TR-A(l) 

I F( N  .EQ.  1)  GO  TO  20 
1-2 

DO  10  II=NDIM1,NN,NDIM 
J-II 

DO  5  JJ- 1, 1 1, NDIM 
C1=CI+ABS( A(J ) *A( J  J  ) ) 

5  J-J  +  l 

TR=TR+A( J ) 

10  1=1  +  1 

TR  =TR / FLOAT( N ) 

DO  15  11=1 ,NN, NDIM l 

15  Cl=Cl  +  (A( 1 1 )  — TR)**2 
20  XNORM  =  ABS(TR )+SQRT( Cl ) 

RETURN 

F.ND 


SUBROUTINE  M AT4 ( N 1 , N2 , XYZ  ) 

Z-YXY"  X=X"  IS  N2XN2  ,  Y  IS  N1XN2,  Z  IS  NIXNl 
DIMENSION  X( 1 ) , Y ( 1 ) , Z( I ) 

COMMON/MAINl/NDIM 

CALL  MMUL(Y,X,Nl,N2,N2,Z) 

NN2-N2*NDIM 
DO  3  1=1, N1 

IMl-  1-1 
II-IM1*NDIM 
JJ-I+II 
DO  2  J  =  1 , N 1 
TEMP*  0. 

KK-J 

DO  1  K-I , NN2 , NDIM 
TEMP=TEMP+Y(K)*Z(K,K) 

1  KK-KK+NDIM 
Z(  JJ)-TEMP 

2  JJ-JJ+NDIM 

JJ-I 
K-I 1  +  1 
KK-II+IMl 
DO  3  J-K.KK 
Z( JJ)-Z( J) 

JJ-JJ+NDIM 

3  CONTINUE 
RETURN 
END 


n  o 


*DF.CK  FTPS  OS  P  3 

SUBROUTINE  TPSOSP2(DT) 

C  FUNCTION-  COMPUTES  THE  MATRIX  RICCATI  EQUATION 
C  PC=(EXP( A*T*PC*EXP( A ' *T ) )  + 

C  +  EXP( A*T ) *1 NTEGR AL( PH I  DT)*EXP(A’*T) 

C  WHERE  PHI  =  FXP( A*T)*0'JE*EXP(-A ' *T) 

C  AND  A'  -  TRANSPOSE  OF  MATRIX  A 

C  INPUT  VALUES: 

C  A:  SYSTEM  SOLUTION  MATRIX. 

C  QUE :  MODEL  ERROR  MATRIX. 

C  PC:  VALUE  OF  COVARIANCE  PROPACATED  OVER  TIME. 

C  DT:  TIME  INCREMENT  IN  UNITS  OF  SECONDS. 

C  NPTSS :  GLOBAL  LEADING  DIMENSION  OF  ALL  ARRAYS. 

C  NPTPC :  THE  ORDER  OF  SUBMATRICES  BEING  MANIPULATED. 

C  OD.QDT,  Pill:  STORAGE  ARRAYS.  WILL  BE  DESTROYED. 

C  OUTPUT  VALUES: 

C  PC:  VALUE  OF  A  PRIORI  COVARIANCE  AFTER  PROPAGATION 

NPTPC, NPTSS, DT, QUE:  INPUT  VALUES  ARE  UNCHANGED. 
A,QD,QDT, PHI:  INPUT  VALUES  ARE  DESTROYED. 

DIME’.'SION  IPVT(S)  , WORK (8) 

DIMENSION  PIII(8,8),PC(8,8),QD(8,8),QDT(8,3) 
DIMENSION  QUE (8, 8) ,A(8,3) 

COMMON/MAIN/NPTPC , NPTSS , PH I , PC ,QD , QDT , QUE , A 
IF(DT  .LE.  0.)  GO  TO  099 
T-DT/3600 
C  TIME  T  IS  NOW  IN  HOUR  UNITS. 

CALL  INTEC( NPTPC , A, T, QUE , PHI , QD, QDT , NPTSS ) 

C  NOW  QDT  =  INTEGRAL  (PHI  DT) 

C  h  -  EXP(-A*T) 

CALL  SGEFA(A(  1,1),  NPTSS ,  NPTPC ,  I  PVT (  I  )  ) 

CALL  SCEDI(A( 1 , l ), NPTSS, NPTPC , IPVT( l ) ,WORK( 1 )) 

C  NOW  A  =  EXP(A*T)  AFTER  MATRIX  INVERSION. 

CALL  M AT4(NPTPC , NPTPC , QDT (1,1),A(1,I),QD(1,I)) 

C  NOW  QD  *  EX  P  (  A*T  )  *  I NTEGR  AL  (PHI  DT)*''.XP(A'*T) 

CALL  MATA (NPTPC, NPTPC, PC( 1,1) ,A( 1,1) ,QDT( 1,1)) 

C  NOW  QDT  =  EXP(A*T)*PC*EXP(A’*T) 

DO  20  1=1, NPTPC 

DO  20  J=I, NPTPC 
iO  PC(I,J)=  QDT ( I , J )+  QD(I,J) 

C  NOW  PC  HOLDS  THE  SUM  OF  BOTH  TERMS  IN 
C  THE  RICCATI  EQUATION. 

999  RETURN 


o  o 


*DF,CK  MEXP2 

SUBROUTINE  MEXP(N,SUBl,TIME,SUB2,Q,QT,N2) 

C  FUNCTION-  COMPUTES  MATRIX  EXPONENTIAL  E  X  P  (  S  U B  1  *T  IM  E  ) 

C  BY  SC  11  UR  METHOD  OF  TRIANGULAR  DECOMPOSITION. 

C  INPUT  VALUES: 

C  SUBl:  MATRIX  TO  BE  EXPONENTIATED. 

C  TIME:  TIME  INCREMENT  IN  AR  ITRARY  UNITS. 

C  SUB2:  STORAGE  ARRAY.  CONTENTS  WILL  BE  DESTROYED. 

C  Q ,  QT  :  STORAGE  ARRAYS . CONTENTS  WILL  BE  DESTROYED. 

C  N2  :  THE  GLOBAL  LEADING  DIMENSION  OF  ALL  ARRAYS. 

C  N:  THE  ORDER  OF  SUBMATR  T  C  ES  TO  BE  MANIPULATED. 

C  OUTPUT  VALUES: 

C  SUB2:  HOLDS  THE  MAIN  RESULT  EXP ( SUB l *T IM E ) . 

C  SUB  1 , Q , QT :  INPUT  CONTENTS  HAVE  BEEN  DESTROYED. 

C  T I M  F. ,  N  ,  N  2  :  INPUT  VALUES  ARE  UNCHANGED. 

DIMENSION  SUB1(N2,N2),SUB2(N2,N2) 

DIMENSION  Q(N2,N2) ,QT(N2,N2) 

C  MULTIPLY  ELEMENTS  OF  SUBl  BY  TIME 
DO  102  1=1, N 

DO  102  J  =1 , N 
SUB  1(1, J)=SUB 1(1, J) ‘TIME 
102  SUB2(I,J)=SUB1(I,J) 

C  GENERATE  IDENTITY  MATRIX  FOR  TNPUT  Q,  FOR  HOR 
DO  3  0  1  =  1  ,  N 

DO  30  J  =  1  ,  N 
Q  (  l ,  J  )  ■  0  . 

30  I F ( I  .EQ.  J)  Q  (  I ,  J  )  » l  . 

CALL  l'QF(N,SUB2,Q,N2) 

C  MATRI . .  SUB2  HAS  BEEN  DESTOYED 

C  Q  IS  NOW  AN  ORTHOGONAL  TRANSFORMATION  MATRIX 
DO  40  1  =  1, N 

DO  40  J  =1  ,  N 
40  QT(I , J)=Q(J, I) 

C  QT  IS  NOW  THE  TR ANS PO S E , AN D  THE  INVERSE,  OF  Q 
CALL  M U LT ( S U B 1 , Q , N , S U B 2 , N 2  ) 

CALL  MULT(QT, SUB2 , N, SUB  1  ,N2  ) 

SUBl  NOW  CONTAINS  THE  TRIANGULAR  MATRIX  QT  * A*Q 
WHERE  ’A’  REPRESENTS  THE  INPUT  VALUE  OF  SUBl. 

DO  50  1  =  1  ,  N 

DO  50  J  =1  ,  N 
50  S  UB  2  (  I  ,  J  )  =0  . 

C  SHR2  IS  SET  TO  A  ZERO  MATRIX  FOR  INPUT  TO  MODULE  FUNCT 
CALL  F U NCT( l , N , SUB  1 , SUB 2 , N 2  ) 

C  SUB2  NOW  HOLDS  EXP(A*TIME)  IN  TRIANGULAR  FORM 
CALL  MULT (SUB  2 , OT , N , SUB  1 , N2  ) 

CALL  MULTIQ , SUB  1 , N, SUB  2  ,N2  ) 

C  SUB2  NOW  HOLDS  F.XP(A*TIME)  IN  ORIGINAL  BASIS  FORM 
RETURN 


n  n 


*DF.CK  INTF.G2 

SUP, ROUT  INF.  INTEG(N ,  A ,  T, QUF. ,  Pll  I ,  QD ,  QDT ,  N2  ) 

C  FUNCTION-  USES  SIMPSON'S  RULE  ON  THE  RICCATt  INTEGRAL. 

C  -  USES  THE  MATRIX  RULE  EXP(A(S+T) )=EXP(AS)*EXP(AT) 

C  SOURCE:  A. it. LEVIS,  COMPUTATIONAL  ASPECTS  OF  THE  MATRIX 
C  EXPONENTIAL,  IEEE  TRANSACTIONS  ON  AUTOMATIC  CONTROL 

C  AUG  1960:410,411. 

C  INPUT  VALUES: 

C  A:  SYSTEM  MATRIX  TO  PF.  EXPONENTIATED. 

C  T:  TIME  INCREMENT  IN  ARBITRARY  UNITS. 

C  QUF.:  SYSTEM  MOREL  ERROR  MATRIX. 

C  PHI,QD,QDT:  STORAGE  ARRAYS.  WILL  RE  DESTROYED. 

C  M2:  GLOBAL  LEADING  DIMENSION  OF  ALL  ARRAYS. 

C  N:  DIMENSION  OF  SUBMATRICFS  TO  P.E  MANIPULATED. 

C  OUTPUT  VALUES: 

C  N,T,N2:  INPUT  VALUES  ARE  UNCHANGED. 

C  A, PHI, QD:  INPUT  CONTENTS  ARE  DESTROYED. 

QDT :  HOLDS  THE  MAIN  RESULT, 

INTECRAL(F.XP(-A*T)*QUF.*EXP(-A'*T)DT) 

DIMENSION  A(N2,N2),QUF.(M2,N2),PHI(N2,N2) 

DIMENSION  QD(N2 , N2 ) ,QDT(N2 ,N2 ) 

T2=*  -T*0. 5 

CALL  MF.XP ( N ,  A(  1 , 1 )  ,T2 ,  PH  I  ( 1 , 1 ) , QD ( l ,  I  )  ,  ODT(  1 , 1 )  ,  N2  ) 
C  NOW  PHI  *  EXP(-A*T/2) 

CALL  T!!I(N , QUE(  1 ,1), PH  1(1,1)  ,QD(  1,1)  ,QDT ( l ,  1 )  , N2 ) 

C  NOW  ODT  *  EXP(-A*T/2)*Ql'E*EXP(-A'*T/2) 

CALL  MSCALE(N, N , ODT ( I , I ) , 4 . O.QDT (1,1)) 

C  NOW  QDT  -  4*EXP(-A*T/2)*QUE*F.XP(-A  **T/2  ) 

CALL  MULT( Pit I(  1 , 1 )  , PHI ( 1 ,  l )  , N,  A(  1 ,  l )  ,N2 ) 

C  NOW  A  *  EXP(-A*T) 

CALL  TRI(N,QUE(l,l),A(l,l),QD(l,l),PHl(l,l),N2) 

C  NOW  PHI  -  EXP(-A*T) *QUE*EXP(-A ' *T) 

DO  10  1=1, N 
DO  10  J  =  I , N 

10  QDT(I,J)*  QUE(I,J)+  QDT ( I , J )+  PHI(I,J) 

C  QDT  SUMMED  THE  THREE  TERMS  OF  SIMPS°N'S  RULE. 

T6=T/6.0 

CALL  M  SC ALE ( M , N , QDT ( 1 , 1 ) , T6 , QDT ( 1 , l ) ) 

C  NOW  QDT  =  QDT*T/6.  SIMPSON’S  RULE  IS  COMPLETE. 

RETURN 

END 


B-5 


*  D  E  C  K  MULT2 

SUBROUTINE  M ULT 2 ( N , X , Y , Z , M2  ) 

C  FUNCTION-  PERFORMS  MATRIX  MULTIPLICATION  Z=X*Y’, 

C  WHERE  Y'  REPRESENTS  THE  TRANSPOSE  OF  MATRIX  Y. 

C  INPUT  VALUES: 

C  X ,  Y :  ARRAYS  THAT  HOLD  MATRICES  TO  BE  MULTIPLIED 

C  7.1  STORAGE  ARRAY.  CONTENTS  WILL  BE  DESTROYED. 

C-  N2 :  THE  GLOBAL  LEADING  DIMENSION  OF  ARRAYS  X,Y,Z 

C  N:  THE  ORDER  OF  THE  SUBMATRICES  TO  BE  MULTIPLIED. 

DIMENSION  X(N2,N2) ,Y(N2,N2) , Z ( N 2 , N 2 ) 

DO  20  I™1 , N 

DO  20  J-l.N 

Z ( I , J ) =  0. 

DO  20  K *1  ,  N 

20  Z(I,J)*  Z(I,J)+X(I,K)*Y(J,K) 

RETUR  N 
END 


* DE CK  TRI 

SUBROUTINE  TR  I  ( N , Q , X , W , Z , N 2 ) 

C  FUNCTION-  PERFORMS  THE  MATRIX  MULTIPLICATION  Z=X*Q*X', 

C  WHERE  X'  REPRESENTS  THE  TRANSPOSE  OF  MATRIX  X. 

C  INPUT  VALUES: 

C  X , Q :  ARRAYS  THAT  HOLD  MATRICES  TO  BE  MULTIPLIED. 

C  W,Z:  STORAGE  ARRAYS.  CONTENTS  WILL  BF.  DESTROYED. 

C  N2:  THE  GLOBAL  LEADING  DIMENSION  OF  ARRAYS  0,X,W,Z. 

C  N:  THE  DIMENSION  OF  SUBMATRICES  TO  BE  MULTIPLIED. 

C  OUTPUT  VALUES: 

C  X , Q , N  2 , N :  INPUT  VALUES  ARE  UNCHANGED. 

C  Z:  HOLDS  THE  MAIN  RESULT  X*Q*X’. 

C  W:  INPUT  VALUES  ARE  DESTROYED. 

DIMENSION  Q(N2,N2),X(N2,N2),W(N2,N2),Z(N2,N2) 

CALL  MULTVX,Q,N,W,N2) 

C  X*0  IS  STORED  IN  W 

CALL  MULT 2 ( N , W  , X , Z , N2 ) 

C  7.  -W  *  X  " 

RETUR  N 
END 


"DECK  MULT 

SUBROUTINE  M ULT ( X , Y  ,  N  ,  Z  ,  N 2 ) 

C  FUNCTION-  PERFORMS  m,\TRIX  MULTIPLICATION  Z*X*Y 
C  INPUT  VALUES: 

C  X , Y :  ARRAYS  THAT  HOLD  MATRICES  TO  BE  MULTIPLIED 

Z  :  STORAGE  ARRAY.  CONTENTS  WILL  BE  DESTROYED. 

N2  :  GLOBAL  LEADING  DIMENSION  OF  ARRAYS  X,Y,Z. 

N:  THE  ORDER  OF  THE  SUBMATRICES  TO  BE  MULTIPLIED 

OUTPUT  VALUES: 

Z:  HOLDS  THE  MATRIX  PRODUCT  X*Y. 

X,Y,N,N2:  INPUT  VALUES  ARE  UNCHANGED. 

DIMENSION  X(N2,N2) ,Y(N2,N2) , Z(N2,N2) 

DO  20  I»I,N 
DO  20  J-l.N 
Z(I, J)-0. 

DO  20  K * l  ,S 

20  Z(I,J)-Z(I,J)+X(I,K)*Y(K,J) 

RETURN 


*DECK  FUNCT 

SUBROUTINE  FUNCT(R,S,T, F.MM) 

C  FUNCTION-THIS  MODULE  COMPUTES  THE  EXPONENT  OF  AN 
C  UPPER  TRIANGULAR  MATRIX  WITH  DISTINCT  EIGENVALUES. 

C  THE  EXPONENTIAL  OF  THE  DIAGONAL  ELEMENTS  IS  COMPUTED 

C  DIR ECTLY. SUPERDTAGONAL  ELEMENTS  ARE  OBTAINED  FROM  THE 

C  LOWER  DIAGONALS  RY  A  RECURRENCE  RELATION. 

C  A  WARNING  IS  PRINTED  IF  MULTIPLE  EIGENVALUES  EXIST. 

C  S01!RCE:  COMPUTATION  OF  FUNCTIONS  OF  TRIANGULAR  MATRICES 
C  BY  B.N.  PARLETT,  MEMO  /'ERL-MARl ,  UNIVERSITY  OF  CALIF. 

C  AT  BERKELEY.  NOV  1974.  ( AD-A  005  916) 

C  INPUT  VALUES: 

C  R:  THE  INDEX  OF  THE  FIRST  ROW  IN  THE  TRIANGULAR  RLOCK. 

C  S:  THE  INDEX  OF  THE  LAST  ROW  IN  THE  TRIANGULAR  BLOCK. 

C  T:  THE  ARRAY  WHICH  CONTAINS  THE  TRIANGULAR  MATRIX. 

C  F:  THIS  ARRAY  CONTAINS  A  ZERO  MATRIX. 

C  MM:  THE  GLOBAL  LEADING  DIMENSION  OF  ARRAYS  T  AND  F. 

C  OUTPUT  VALUES: 

C  R , S , T , MM :  INPUT  VALUES  ARE  UNCHANGED. 

C  F:  CONTAINS  THE  RESULT,  THE  EXPONENTIAL  OF  MATRIX  T. 

DIMENSION  T(MM,MM) ,F(MM,MM) 

INTEGER  R.S 
REAL  EXP 
DO  10  I=R,S 

C  THE  IF-BLOCK  GIVES  14-DIGIT  ACCURACY  WITHOUT  UNDERFLOW 
IF (  T( I , I )  .LT.  -43.)  THEN 
F(I,I)=0. 

ELSE 

F( I , I )=EXP(  T(I , I)  ) 

END  IF 

10  CONTINUE 

C  PROCESS  THE  KTII  SU  PER  DIAGONAL 
N-S-R+l 
NN=N— l 

C  NN  «  NUMBER  OF  SUPERDIAGONALS  IN  THE  BLOCK 
IF(NN  .EQ.  0)  RETURN 
DO  13  K=l,NN 
LL“S-K 

"0  12  I=R , LL 

C  CHECK  FOR  MULTIPLE  EIGENVALUES. 

DIFF=T(I,t)-T(I+K,I+K) 

IF( ABS(DIFF)  .EQ.  0.0)  GO  TO  14 
G=T(  l ,  I +K  )  *(  F(  I ,  I  )-F(  I +K ,  I+K )  ) 

KK=K-1 

1F(KK  .EQ.  0)  GO  TO  12 
DO  II  M=I,KK 

11  G*G+( F( I , l+M) *T( I+M , I+K)-T( I , l+K-M) *F( l+K-M , I+K ) ) 

12  F( I, I+K)=G/DIFF 

13  CONTINUE 
RETURN 


o  n  n  n 


14  PRINT  * , ERROR  IN  MODULE  FUNCT. ' 

PRINT  *, 'TRIAMCULAR  MATRIX  HAS  MULTIPLE  EIGENVALUES. ' 
RETURN 
END 
*DECK  !i(JR 

SUBROUTINE  JIQR  (ICH , H,  Z, N2) 

C  FUNCTION-  CALCULATES  THE  ORTHOGONAL  SCIIUR  TRANSFORMATION 
C  MATRIX  FOR  THE  UPPER  flESSF.NBERC  MATRIX  WHICH  IS  INPUT. 

C  USES  THE  QR  ITERATIVE  METHOD. 

C  PRINTS  A  WARNING  IF  THE  EIGENVALUES  DO  NOT  CONVERGE. 

C  SOURCE:  THE  EISPACK  GUIDE. 

C  INPUT  VALUES: 

C  ICH:  DIMENSION  OF  THE  REQUIRED  TRANSFORMATION  MATRIX. 

C  H:  THE  UPPER  HESSENBERC  MATRIX  TO  BE  TRANSFORMED. 

C  Z:  MUST  BE  AN  IDENTITY  MATRIX. 

C  N2 :  THE  CLORAL  LEADING  DIMENSION  OF  ARRAYS  H  AND  Z. 

C  OUTPUT  VALUES: 

C  ICH.N2:  INPUT  VALUES  ARE  UNCHANGED. 

C  H:  INPUT  VALUES  ARE  DESTROYED. 

C  Z:  GONTAINS  THE  ORTHOGONAL  SCHUR  TRANSFORM  MATRIX. 

INTEGER  I ,  J ,  K ,  L ,  M ,  N,  EM,  1 1 ,  J  J ,  LL  ,i  IM ,  NA ,  NM ,  NH ,  M2 , 

X  ICH,  ITS,  LOW  ,MP2 ,  ENM2 , IF.RR  ,MIMO 

REAL  H(N2,N2),Z(N2,N2) 

REAL  P,Q,R»S,T,W,X,Y,RA,SA,VI,VR,ZZ, NORM 
R EAL  M  ACHEP , SORT , ABS , S ICM , REAL , AIM  AG 
LOGICAL  NOTLAS 
COMPLEX  Z3.GMPLX 


MACHEP  IS  A  PARAMETER  THAT  SPECIFIES  PRECISION 
MACHEP-O. OOOOOOOOOOOOI  . 

NM-ICH 
N-ICH 
LOW-  1 
I  ERR-  0 
NORM-  0. 

X-  l 

C  COMPUTE  ’<ATPIX  NORM 
DO  50  I  =  l,M 
DO  40  J  *  K ,  M 

40  NORM  -  NORM  +  ABS(U(I,J)) 

K  =  I 
50  CONTINUE 
en  =  i on 
T  -  0.0 

C  ***  SEARCH  FOR  NEXT  EIGENVALUES*** 

60  IF(KN  .LT.  LOW)  GO  TO  1001 
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rj  o 


'.'A  »  EN  -  1 
EHM2  ■  NA  -  l 

***L00K  FOR  SINGLE  SMALL  SUB-DIAGONAL  ELEMENT 
FOR  L=EN  STEP  -1  UNTIL  LOW  DO  *** 

70  DO  SO  LL  =  LOW,  EN 
L  *  EN  +  LOW  -  LL 
IF(L  . F.Q.  LOW)  GO  TO  100 
S  =  ABS ( H ( L — I ,L-1 ))  +  ABS(H(L, L) ) 

IF(S  .EQ.  0.0)  S  -  NORM 

IF(ABS(Il(L,L-l ) )  .LE.  MACI1EP  *  S)  GO  TO  100 
80  CONTINUE 
C  ***  FORM  SHIFT  *** 

100  X  *  !!(EN,EN) 

IF(L  .EQ.  EN)  CO  TO  270 

Y  *  H(NA,NA) 

W  *  H(EN,NA)  *  H(NA,EN) 

IF( L  .EQ.  MA)  GO  TO  280 
IF ( ITS  .EQ.  30)  CO  TO  1000 
IF( ITS  .NE.  10  .AND.  ITS  .ME.  20)  GO  TO  130 
C  ***  FORM  EXCEPTIONAL  SHIFT  *** 

T  -  T  +  X 

DO  120  I  -  LOW ,EN 
120  H(I,I)  =■  H(I ,  I)  -  X 

S  -  ABS(H(F,N,NA) )  +  ABS(H(NA,ENM2)) 

X  =  0.75  *  S 

Y  *  X 

W  *  -0.4375  *  S  *  S 
130  ITS  -  ITS  +  l 

C  LOOK  FOR  TWO  CONSECUTIVE  SMALL  SUB-DIACONAL  ELEMENTS 

C  ***  FOR  M-EN-2  STEP  -1  UNTIL  L  DO  *** 

DO  140  MM  *  L,  ENM2 
M  »  ENM2  +  L  -  MM 
7. Z  -  H(M,M) 

R  -  X  -  ZZ 
S  *  Y  -  ZZ 

P  =*  (R  *  S  -  W)  /  H(M+1,M)  +  i1(M,M+l) 

Q  =  H(M+l,M+l)  -  ZZ  -  R  -  S 
R  =  H(M+2,M+l) 

S  =  ABS(P)  +  ABS(Q)  +  ABS(R) 

P  *  P/S 
Q  -  Q/S 
R  -  R/S 

TF(M  .F.Q.  L)  GO  TO  150 

IF(ABS(i!(M,M— 1  ))*(ABS(Q)  +  A^S (R )  )  .LF..  MAG!IEP*ABS(P) 


IT 


X  *  (A3S(!l(M-l,:i-l))+ABS(ZZ)FABS(M(M+l,M+l)))) 

X  GO  TO  150 
140  CONTINUE 
150  MP2  =  M  +  2 

DO  160  I  -  MP2,  EN 
11(1,1-2-)  -  0.0 
IF ( I  .F.q.  MP2)  GO  TO  160 
H(I,I-3)  =■  0.0 
160  CONTINUE 

C  *  DOUBLE  QR  STEP  FOR  ROWS  L  TO  EN  AND  COLUMNS  M  TO  F.N  * 
DO  260  K  =-  M,  NA 
NOTLAS  -  K  .NE.  NA 
IF  (K  .EQ.  M)  GO  TO  170 
P  =•  H(K,K-1) 
q  *  H(IC+l,K-I) 

R  =*  0.0 

IF  (NOTLAS)  R  -  H(K+2,K-1) 

X  *  ABS(P)  +  ABS(Q)  +  ABS(R) 

IF(X  .F.q.  O.'O)  CO  TO  260 
P  *  P/X 

Q  *  q/x 

R  -  R/X 

170  S  -  SlGN(SqRT(P*P+q*q+R*R) ,P) 

IF(K.Eq.  M)  GO  TO  180 
H(K,K-1)  *  -S  *  X 
CO  TO  190 

130  IF(L  .NE.  M)  H(K,K-1 )  =*  -H(K.K-l) 

190  P  *  P  +  S 

X  =*  P/S 
Y  -  q/s 
7.Z  »  R/S 

q  -  q/p 

R  *  R/P 

C  ***  ROW  MODIFICATION  *** 

DO  210  J  =■  K,  N 

P  =•  H(K,  J)  +q  *  H ( K+l ,  J ) 

T F ( . NOT .  NOTLAS)  GO  TO  200 
P  -  P  +  R  *  II(K+2 ,  J) 

!1(K+2,J)  =■  H(K+2,J)  -  P  *  7.7 

200  H(K+l,J)  -  H(K+l,J)  -  P  *  Y 

H(K, J)  -  H(K, J)  -  P  *  X 

i 10  CONTINUE 

.1  *  MIN0(EN,Kf3) 

***  COLUMN  MODIFICATION  *** 

DO  230  I  =  l,  J 

P  -  X  *  11(1, X)  +  Y  *  il(I,K+l) 


C 


I F (  .NOT.  NOTLAS)  GO  TO  220 
P  -  P  +  7.7.  *  H  (  l ,  K  +  2  ) 

H  (  I  ,  K+2  )  =»  H  (  I ,  K+2  )  -  P  *  R 
220  H ( I , K  +  l )  =  H(l,K+l)  -  P  *  Q 

n  (  I  ,  K  )  =  II  (  I  ,  K  )  -  P 
2  30  CONTTNUF. 

***  ACCUMULATE  TRANSFORMATIONS  *** 
00  250  I  =  LOW,  IGH 

P  *  X  *  Z(I,K)  +  Y  *  7.  (I,  K+l) 
IF(  .NOT.  NOTLAS)  GO  TO  240 
P  -  P  +  ZZ  *  7.(1,  K+2) 

Z  (  I ,  K+2  )  =»  Z  (  I  ,  K  +  2  )  -  P  *  R 
240  7(1, K+l)  -  7 (I, K+l)  -  P  *  Q 

Z(I,K)  =  Z ( I , K )  -  P 
250  CONTINUE 

260  CONTINUE 
GO  TO  70 

***  ONE  ROOT  FOUND  *** 

270  H  (  E  N ,  F, N  )  -  X  +  T 
EN  *  NA 
CO  TO  60 

***  TW0  ROOTS  FOUND  *** 

280  P  -  (Y  -  X)/  2.0 
q  »  P*P  +  W 
ZZ  -  SQRT(ABS(Q)) 

H  (  F.  N  ,  E  N  )  *  X  +  T 
X  -  H(EN,F.N) 
ll(NA.NA)  »  Y  +  T 
I F( Q  .LT.  0.0)  GO  TO  320 
***  REAL  PAIR  *** 

7Z  -  P  +  SIGN(ZZ.P) 
a  -  U  (  E  N  ,  N  A  ) 

S  =■  A  8S  (  X  )  +  A  R  S  (  7  Z  ) 

P  -  X  /  S 

Q  »  ZZ  /  S 

R  -  S  QR  T ( P  *  P  +  Q*Q) 

P  -  P  /  R 

Q  -  q  /  R 

***  ROW  MODIFICATION  *** 

DO  290  J  =■  NA,  N 
ZZ  »  H ( N A , J ) 

II  (  N  A ,  J  )  =*  q*7.7  +  P*H(EN,J) 

11  (  E  N  ,  J  )  =■  q  *11  (  E  N  ,  J  )  -  P  *  7  Z 
290  CONTINUE 

***  COLUMN  MODIFICATION  *** 


c  ***  COLUMN  MODIFICATION  *** 

DO  300  I  -  l,  EN 
7.Z  =■  11(1,  NA) 

H(I,MA)  -  Q*ZZ  +  P*H(  I  ,F.N) 

U(I.EN)  =  Q*H(I,EN)  -  P*ZZ 
300  CONTINUE 

C  ***  ACCUMULATE  TRANSFORMATIONS  *** 

DO  310  I  =  LOW ,  ICH 
ZZ  =*  Z(I,NA) 

Z ( I , NA )  *  Q*ZZ  +  P*Z( I , EN) 

Z(l,EN)  -  Q*Z( I , EN)  -  P*ZZ 
310  CONTINUE 
CO  TO  330 

C  ***  COMPLEX  PAIR  *** 

320  CONTINUE 
330  EN  *  ENM2 
GO  TO  60 

C  ERROR  -  HO  CONVERGENCE  TO  EIGENVALUE  AFTER  30  ITERATIONS 

1000  I ERR  =  EN 

PRINT  *, 'EIGENVALUE ' ,IERR, 'DOES  NOT  CONVERGE. ' 

PRINT  *,  'SUGGEST  INCREASING  MAC1IEP  IN  MODULE  HQR  ' 

1001  RETURN 
END 
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on  on  o  o  on  o  o  non  o  noon  o  non 


*DECK  SCEFA 

SUBROUTINE  SGEFA( A,LDA,N, IPVT) 

INTEGER  LDA, N, IPVT (t), INFO 
REAL  A(LDA, I ) 

C 

C  SCEFA  FAGTORS  A  REAL  MATRIX  BY  GAUSSIAN  ELIMINATION 
C  A  WARNING  IS  PRINTED  IF  A  ZERO  EIGENVALUE  IS  FOUND. 

ON  ENTRY: 

A:  THE  MATRIX  TO  BE  FACTORED 

LDA:  THE  GLOBAL  LEADING  DIMENSION  OF  THE  ARRAY  A 
N:  THE  ORDER  OF  THE  MATRIX  TO  RE  FACTORED. 

IPVT: STORAGE  ARRAY.  INPUT  CONTENTS  WILL  BE  DESTROYED 

ON  RETURN 

A:  AN  UPPER  TRIANCULAR  MATRIX  AND  THE  MULTIPLIERS 
WHICH  WERE  USED  TO  OBTAIN  IT. 

IPVT:  AN  INTEGER  VECTOR  OF  PIVOT  INDICES 
THIS  IS  FROM  UNPACK  USER'S  GUIDE, VERSION  08/14/78 
REAL  T 

INTECF.R  ISAMAX,  J  ,K,KPl  ,L,NM1 

GAUSSIAN  ELIMINATION  WITH  PARTIAL  PIVOTTNC 
INF0=  0 
NMl-  N-l 

IF (MM I  .LT.  1)  GO  TO  70 
DO  60  K-l.NMl 
KP1=*  X+L 

FIND  L  =*  PIVOT  INDEX 

L*  I S  AM AX ( N-K+l , A( K , K ) , 1 )+K-i 
IPVT(K)=*  L 

ZERO  PIVOT  IMPLIES  THIS  COLUMN  IS  TR IANGULARIZED 
IF(A(L,K)  .F.Q.  O.OEO)  GO  TO  40 

INTERCHANGE  IF  NECESSARY 
IF(L  .EQ.  K)  GO  TO  10 
T*A(L,K) 

A(L,K)-  A(K,X) 

A(K,K)-  T 
10  CONTINUE 

COMPUTE  MULTIPLIERS 
T=*  -l  . OF.O/A(K,K) 

CALL  SSCAL(N-K,T,A(K+1 ,K),1) 


C  ROW  ELIMINATION  WITH  COLUMN  INDEXING 
DO  30  J  =K  P 1  ,  N 
T  ■  A  (  L  ,  J  ) 

I F ( L  . EQ.  K)  CO  TO  20 
A( L , J  )  *  A(K,J) 

A(K,J)=*  T 
20  CONTINUE 

CALL  SAXPYlN-K ,T, A(K  +  l , K) , 1 , A( K  +  l  ,  J  )  ,  1  ) 

30  CONTINUE 
GO  TO  50 
40  CONTINUE 
I N  F  O  =*  K 
50  CONTINUE 
60  CONTINUE 
70  CONTINUE 
IPVT(N)=*  N 

I  F  (  A  (  N  ,  N  )  .EQ.  O.OF.O)  INFO=  N 
I F ( INFO  .HE.  0)  THEN 

rRINT  *, 'EIGENVALUE  ',INF0,’=0  IN  MODULE  SC.EFA.  ' 
PRINT  *  ,  'THIS  CAUSES  MODULE  SCEDI  TO  DIVIDE  B Y  0. 
END  IF 
RE  TUP  N 
E  N  D 

*  DECK  SCEDI 

SUBROUTINE  SCEDI(A,LDA,N, I  PVT .WORK) 

INTEGER  LDA.N, IPVTf l) 

REAL  A (  L DA  ,  I  )  , W OR K (  l  ) 

C 

C  SGEDI  COMPUTES  INVERSE  OF  MATRIX  A  USING 
C  FACTORS  COMPUTED  BY  SGF.FA. 

C 

C  ON  ENTRY: 

C  A:  THE  OUTPUT  FROM  SGF.FA,  REAL(LDA.N) 

C  LDA :  THE  LEADING  DIMENSION  OF  ARRAY  A 
C  N:  THE  ORDER  OF  MATRIX  A 

C  IPVT:  THE  PIVOT  VECTOR  FROM  SCEFA, INTEGER(N) 

C  WORK:  WORK  VECTOR  ,  CONTENTS  DE  STR  OY  E  D  ,  R  F,  AL  (  N  ) 

C 

C  ON  RETURN: 

C  A:  INVERSE  OF  THE  OR IC INAL  MATRIX 

C 

C  ERROR  CONDITION:  A  DIVISION  BY  7.ER0  WILL  OCCUR  IF  THE 
C  INPUT  FACTOR  CONTAINS  A  ZERO  ON  THE  DIAGONAL. 

C  IT  WILL  NOT  OCCUR  IF  SCEFA  HAS  SET  INFO*0 


C  THIS  IS  FROM  LINPACK  USER’S  OU  IDF,  .VERSION  08/14/78 
REAL  T 

INTEGER  I, J,K,KB,KP1,L,NM1 
C 

C  compute  inverse 

00  100  K-l.H 
A  (  K  ,  K  )  =»  l.OEO/A(K,K) 

T 3  ~A( K , K) 

CALL  SSCAL(K-1 ,T,A( 1 ,K) , 1 ) 

KP 1  *  K  +  l 

I F ( N  .LT.  KP1)  GO  TO  90 
no  80  J  =*K  P  l  ,  N 
T-  A ( K , J ) 

A(K,J)=  O.OF.O 

CALL  SAXPY(K,T, A(1 ,K) , 1 , A( l , J) ,1 ) 

80  CONTINUE 
90  CONTINUE 
100  CONTINUE 
C 

C  FORM  INVERSE(U) *1 NVER  SE ( L ) 

NM 1 »  N-l 

I F ( NM  1  .LT.  1)  GO  TO  140 
DO  130  K  B  =  1 , NM 1 
K=>  N-KB 
KP  l  =  K+l 
DO  110  I =KP l , N 
WORK(I)*  A ( I , K ) 

A ( I , K ) ■  O.OEO 
110  CONTINUE 

DO  120  J-KPl.N 
T»  WORK(J) 

CALL  SAXPY(N,T. A( 1 , J) , 1  ,  A(  1  , K) , l ) 

120  CONTINUE 
L»  IPVT(K) 

I F ( L  .NE.  K )  CALL  S SW AP ( N , A ( l , K ) , 1 , A( 1 , L  )  , 1  ) 

130  CONTINUE 
140  CONTINUE 
RETURN 
END 

*  DECK  ISAMAX 

INTEGER  FUNCTION  I  S  AM  AX  (  N  f  S  X  ,  I  NC  X  > 

C  ISAMAX  FINDS  INDEX  OF  ELEMENT  WITH  MAX.  ABSOLUTE  VALUE. 
C  LINPACK  USER’S  GUIDE,  VERSION  03/11/78 
REAL  SX(1),SMAX 
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INTEGER  I , I NCX  ,  IX , N 
ISAMAX-  0 

I F ( N  .LT.  1)  RETURN 
ISAMAX-  1 

IF( N  .EQ.  1)  RETURN 
IF( I NCX  .EQ.  1)  GO  TO  20 
C 

C  CODE  FOR  INCREMENT  NOT  F.OUAL  TO  1 
IX-  1 

SMAX-  ABS ( SX( 1 ) ) 

IX-  IX+INCX 
DO  10  1-2, N 

IF(ABS(SX(IX)).LE.SMAX)  GO  TO  5 

ISAMAX-  I 

SMAX-  ABS (SX(IX)) 

5  IX-  IX+INCX 
10  CONTINUE 
RETURN 
C 

C  CODE  FOR  INCREMENT  EQUAL  TO  l 
20  SMAX*  ABS ( SX ( 1 ) ) 

DO  30  1-2, N 

IF(ABS(SX(I)) .LE.SMAX)  CO  TO  30 

ISAMAX-  I 

SMAX-  ABS ( SX ( I ) ) 

30  CONTINUE 
RETURN 
END 

♦DECK  SAXPY 

SUBROUTINE  S AXP Y ( N , S A , SX , I NCX , S Y , I NCY ) 

C  CONSTANT  TIMES  A  VECTOR  PLUS  A^VECTOR. 

C  USES  UNROLLED  LOOP  FOR  INCREMENTS-  1. 

C  FROM  UNPACK  USER'S  GU I  DE  ,  VER  S  I  ON  03/11/78 
REAL  SX( l ) ,SY(l ) , S A 
INTEGER  I , I NCX, INCY, IX, IY,M ,MPl ,N 
T F ( N  .LE.  0)  RETURN 
I F( S A  .EQ.  0.0)  RETURN 

I F ( I NCX  .EQ.  I  .AND.  INCY  . F 0 .  1)  GO  TO  20 
C 

C  CODE  FOR  UNEQUAL  I  NCR  F.M  ENTS  OR  FOR 
C  EQUAL  INCREMENTS  NOT  EQUAL  Tn  l 

IX-  1 

TY-  l 

I F ( INCX.LT.O)  IX-  (  —  N+l ) *  1 NCX  +1 
IF( INCY.LT.O)  IY-  (-N+1)*INCY  +1 


an  o  a  a  n 


DO  10  I-l,N 

SY(IY)-  S Y ( I Y )+  S  A*SX ( I X  ) 

IX*  IX+INCX 
I Y  *  IY+INCY 
10  CONTINUE 
RETURN 

C 

C  CODE  FOR  ROTH  INCREMENTS  EQUAL  TO  l 

C  CLEAN-UP  LOOP 

20  M*MOD(N, 4 ) 

I F ( M  .EQ.  0)  GO  TO  40 

DO  30  I *1 , M 

SY ( I  )  *  SY(lj+  S A*SX( I ) 

30  CONTINUE 

IF( N  .LT.  4)  RETURN 
40  MPl*  M+l 

DO  SO  I  *M  P  1 ,  N  ,  4 
S  Y (  I  )  =  S Y( I  )  +  S  A* S X  (  I  ) 

S Y ( I +1 ) *  S Y ( I +1 )+  S A*SX ( 1+1 ) 

S Y ( I +2 ) *  S Y ( 1+2 )+  S A* S X ( 1+2 ) 

SY( 1+3 )*  S Y ( 1+3 )+  SA*SX( 1+3 ) 

50  CONTINUE 
RETURN 
END 

*DECK  S  SC AL 

SUBROUTINE  S SC AL ( N , S A , S X , I NC X ) 

SCALES  A  VECTOR  BY  A  CONSTANT. 

USES  UNROLLED  LOOPS  FOR  INCREMENT  EQUAL  TO  1 
UNPACK  USER'S  GU I  DE  ,  VER  S  ION  03/11/78 

REAL  S A, SX( 1 ) 

INTEGER  I , I NCX , M , M  P l , N , N INC  X 
I F ( N  .LE.  O)  RETURN 
IF (I NCX  .EQ.  1)  GO  TO  20 

CODE  FOP.  INCREMENT  NOT  EQUAL  TO  1 
NINCX-  N*I NCX 
DO  10  l-l , NINCX, INCX 
SX(I)-  S A*SX ( I ) 

10  CONTINUE 
RETURN 

C 

C  CODE  FOR  INCREMENT  EQUAL  TO  1. 

C  CLEAN-UP  LOOP 

20  M-  MOD(N, 5) 


IF ( ! !  .EO.  0)  GO  TO  40 
DO  30  I-1,M 

SX(I)-  SA*SX( I ) 

30  CONTINUE 

IF(N  .LT.  5)  RETURN 
40  MPl-  M+I 

DO  50  I=MP1,N,5 

SX(I)-  SA*SX( I ) 

SX(I+1)-  SA*SX(I+1 ) 

SX(I+2)-  SA*SX(  1+2) 

SX(I+3)-  SA*SX(I+3) 

SX( 1+4 )-  SA*SX(I+4) 

50  CONTINUE 
RETURN 
END 

*DECK  SSWAP 

SUBROUTINE  SSWAP(N,SX, INCX.SY, INCY) 

C  INTERCHANGES  TOO  VECTORS.. 

C  USES  UNROLLED  LOOPS  FOR  INCREMENTS  EQUAL  TO  1. 
C  LIHPACK  USER'S  GUIDE, VERSION  03/11/7B 
C 

REAL  SX(l),SY(l), STF.MP 

INTEGER  I, IMCX, INCY, IX, IY,M,MPl ,N 

IF(N  .LE.  0)  RETURN 

IF( INCX  .EQ.  1  .AND.  INCY  .EQ.  1)  GO  TO  20 
C  CODE  FOR  UNEQUAL  INCREMENTS,  OR  EQUAL 
C  INCREMENTS  NOT  EQUAL  TO  l. 

IX-  l 
IY-  1 

IF (INCX  .LT.  0)  IX-  (-N+I )*INCX+l 
IF( INCY  .LT.  0)  IY-  (-N+1 )*INCY+l 
DO  10  I-l.N 
STEM P*  SX(IX) 

SX(IX)-  SY(IY) 

SY(IY)-  STEMP 
IX-  IX+INCX 
IY-  IY+INCY 
10  CONTINUE 
RETURN  ' 


CODE  FOR  BOTH  INCREMENTS  EQUAL  TO  1 
CLEAN-UP  LOOP 
20  M-  M0D(N,3) 

IF ( M  .EQ.  0)  CO  TO  40 
DO  3  0  I  =*1  ,M 
STEMP-  SX(I) 

SX(I)-  S Y ( I  ) 

S Y ( I )  =  STEMP 
30  CONTINUE 

IF ( N  .LT.  3)  RETURN 
4  0  M  P I -  M  +  l 

DO  50  I-MPI.H.3 
STEMP-  SX(I) 

SX(I)»  S  Y ( I ) 

S Y ( I ) -  STEMP 
STEM  P-  SX(I+l) 

SX(I+1)-  S Y ( I  +  1 ) 

SY(I+l)«  STEMP 
STEM  P-  SX( 1  +  2 ) 

SX( 1+2 )-  S Y(  I  +2  ) 

SY(I+2)-  STEMP 
50  CONTINUE 
RETURN 
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